Run-Length Encoding (RLE) is a simple compression technique that replaces consecutive repeating characters with a count and the character itself. A common way to represent this is using frequency/data pairs.
Consider the following string: “AAABBBCCCDDDE”. Using RLE frequency/data pairs, we can represent this as:
Here’s a basic implementation of RLE in code:
def encode(data):
encoded = ""
count = 1
for i in range(1, len(data)):
if data[i] == data[i-1]:
count += 1
else:
encoded += str(count) + data[i-1]
count = 1
encoded += str(count) + data[-1]
return encoded
def decode(data):
decoded = ""
i = 0
while i < len(data):
count = ""
while data[i].isdigit():
count += data[i]
i += 1
decoded += int(count) * data[i]
i += 1
return decoded
# Example usage
data = "AAABBBCCCDDDE"
encoded_data = encode(data)
print("Encoded:", encoded_data)
decoded_data = decode(encoded_data)
print("Decoded:", decoded_data)
encode(data) function:
* Initializes encoded to an empty string.
* Sets count to 1 to track the repetition count.
* Iterates through the data string, comparing adjacent characters.
* If characters are the same, count is incremented.
* If characters are different, the current count and the previous character are appended to encoded as a frequency/data pair, and count is reset to 1.
* After the loop, the final count and the last character are appended to encoded.
decode(data) function:
* Initializes decoded to an empty string.
* Iterates through the data string.
* Extracts the frequency (number) from data by reading digits.
* Appends the character repeated count times to decoded.
RLE using frequency/data pairs is a simple and useful compression technique. Its effectiveness depends on the nature of the data being compressed. However, for data with repeating sequences, RLE can provide a significant reduction in size, making it a valuable tool in data representation and storage.
Create a customised learning path powered by AI — stay focused, track progress, and earn certificates.
Build Your Learning Path →