Run-Length Encoding (RLE) is a simple compression technique that replaces consecutive repeating characters with a count and the character itself. A common way to represent this is using frequency/data pairs.
Consider the following string: "AAABBBCCCDDDE". Using RLE frequency/data pairs, we can represent this as "3A3B3C3D1E": three A's, three B's, three C's, three D's, and one E.
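To make the pairs explicit, here is a minimal sketch that groups the string into runs with the standard library's itertools.groupby and returns them as (count, character) tuples. The helper name to_pairs is hypothetical and used only for illustration.

```python
from itertools import groupby


def to_pairs(data):
    """Return the runs of data as (count, character) tuples (hypothetical helper)."""
    # groupby yields one (character, run-iterator) pair per block of equal characters.
    return [(len(list(run)), char) for char, run in groupby(data)]


pairs = to_pairs("AAABBBCCCDDDE")
print(pairs)
# [(3, 'A'), (3, 'B'), (3, 'C'), (3, 'D'), (1, 'E')]

# Writing each pair as count followed by character gives the compact string form.
print("".join(f"{count}{char}" for count, char in pairs))
# 3A3B3C3D1E
```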
Here's a basic implementation of RLE in code:
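def encode(data):
    """Run-length encode a string as count/character pairs, e.g. "AAAB" -> "3A1B".

    Assumes data contains no digit characters; digits would make the output ambiguous.
    """
    if not data:
        return ""  # nothing to encode; also avoids indexing data[-1] on an empty string
    encoded = ""
    count = 1
    for i in range(1, len(data)):
        if data[i] == data[i - 1]:
            count += 1  # still inside the same run
        else:
            # The run ended: emit its count and character, then start a new run.
            encoded += str(count) + data[i - 1]
            count = 1
    # Emit the final run, which the loop itself never writes out.
    encoded += str(count) + data[-1]
    return encoded


def decode(data):
    """Expand count/character pairs back into the original string.

    Assumes well-formed input as produced by encode().
    """
    decoded = ""
    i = 0
    while i < len(data):
        count = ""
        # Read the count, which may span several digits.
        while data[i].isdigit():
            count += data[i]
            i += 1
        # data[i] is now the character to repeat.
        decoded += int(count) * data[i]
        i += 1
    return decoded


# Example usage
data = "AAABBBCCCDDDE"
encoded_data = encode(data)
print("Encoded:", encoded_data)  # Encoded: 3A3B3C3D1E
decoded_data = decode(encoded_data)
print("Decoded:", decoded_data)  # Decoded: AAABBBCCCDDDE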
The encode(data) function:
- Initializes encoded to an empty string.
- Initializes count to 1 to track the repetition count.
- Iterates through the data string, comparing adjacent characters.
- If the characters are the same, count is incremented.
- If they differ, count and the previous character are appended to encoded as a frequency/data pair, and count is reset to 1.
- After the loop, the final count and the last character are appended to encoded.
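The encoding steps can be followed on a short input with a traced variant of the same loop. This is only a sketch: encode_with_trace is a hypothetical helper, not part of the implementation above, and it records one message per step alongside the encoded result.

```python
def encode_with_trace(data):
    """Mirror of the encode() loop that also records each step (hypothetical helper).

    Returns a tuple (encoded, steps).
    """
    steps = []
    if not data:
        return "", steps
    encoded = ""
    count = 1
    for i in range(1, len(data)):
        if data[i] == data[i - 1]:
            count += 1
            steps.append(f"i={i}: {data[i]!r} repeats, count={count}")
        else:
            # A different character closes the current run.
            encoded += str(count) + data[i - 1]
            steps.append(f"i={i}: run ended, emit {count}{data[i - 1]}")
            count = 1
    # The last run is only written out after the loop finishes.
    encoded += str(count) + data[-1]
    steps.append(f"end: emit final run {count}{data[-1]}")
    return encoded, steps


encoded, steps = encode_with_trace("AABCCC")
for line in steps:
    print(line)
print("Result:", encoded)  # Result: 2A1B3C
```

The last recorded step shows why the append after the loop is needed: without it, the final run (here 3C) would never reach the output.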
The decode(data) function:
- Initializes decoded to an empty string.
- Iterates through the data string.
- Extracts the count from data by reading digits.
- Appends the character that follows, repeated count times, to decoded.

RLE using frequency/data pairs is a simple and useful compression technique. Its effectiveness depends on the nature of the data being compressed: for data with long repeating sequences, RLE can provide a significant reduction in size, making it a valuable tool in data representation and storage, while data with few repeats may not shrink at all.
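How much the data's shape matters can be checked by comparing encoded and original lengths. The sketch below uses a compact stand-in encoder built on itertools.groupby; the names rle_encode and size_ratio and the sample strings are assumptions made for illustration.

```python
from itertools import groupby


def rle_encode(data):
    """Compact stand-in encoder: count followed by character for each run."""
    return "".join(f"{len(list(run))}{char}" for char, run in groupby(data))


def size_ratio(data):
    """Encoded length divided by original length; assumes data is non-empty."""
    return len(rle_encode(data)) / len(data)


samples = {
    "long runs": "A" * 50 + "B" * 50,
    "short runs": "AAABBBCCCDDDE",
    "no runs": "ABCDEFGH",
}
for label, text in samples.items():
    encoded_len = len(rle_encode(text))
    print(f"{label}: {len(text)} -> {encoded_len} chars (ratio {size_ratio(text):.2f})")
# long runs: 100 -> 6 chars (ratio 0.06)
# short runs: 13 -> 10 chars (ratio 0.77)
# no runs: 8 -> 16 chars (ratio 2.00)
```

A ratio below 1.0 means the encoded form is smaller. With no repeated characters, every character gains a leading "1", so the output is twice the size of the input.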