When you abbreviate or use coded language to shorten the original text, you are “compressing text.” Computers do this too, in order to save storage space and transmission time. The art and science of compression is about figuring out how to represent the SAME DATA with FEWER BITS.

What makes compression hard? You can start in lots of different ways, and early choices affect later ones: once you find one set of patterns, others emerge. There is also a tipping point where the dictionary gets so big that it costs more than it saves. At that point you might start rethinking the dictionary to squeeze out a few more bits.
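The tipping point can be seen in a small Python sketch. It assumes a simplified cost model that is not from the lesson itself: every character counts as one unit, each dictionary entry costs its symbol plus its pattern, and the symbols `1` and `2` never appear in the original text. The function names and the sample text are made up for illustration.

```python
def compress(text, dictionary):
    """Replace each pattern with its symbol, in dictionary order."""
    for symbol, pattern in dictionary.items():
        text = text.replace(pattern, symbol)
    return text


def decompress(text, dictionary):
    """Undo compress() by expanding symbols in reverse order."""
    for symbol, pattern in reversed(list(dictionary.items())):
        text = text.replace(symbol, pattern)
    return text


def total_size(text, dictionary):
    """Characters in the compressed text plus the cost of storing the dictionary."""
    compressed = compress(text, dictionary)
    dictionary_cost = sum(len(s) + len(p) for s, p in dictionary.items())
    return len(compressed) + dictionary_cost


text = "pitter_patter_pitter_patter"  # 27 characters

small = {"1": "tter"}
better = {"1": "pitter_patter"}
too_big = {"1": "pitter_patter", "2": "_"}  # the extra entry costs more than it saves

for name, d in [("small", small), ("better", better), ("too_big", too_big)]:
    assert decompress(compress(text, d), d) == text  # nothing was lost
    print(name, total_size(text, d))
```

Under this cost model the three dictionaries come out at 20, 17, and 19 characters. Adding the second entry to the best dictionary makes the total worse, which is the tipping point in miniature.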

Are the compression amounts that we’ve found the best possible? Is there a way to know what the best compression is? We don’t know! There are so many possibilities that it’s hard to be sure. The “best” is really just the best we’ve found so far.
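One way to search without trying every possibility is a greedy strategy: pick the single pattern that saves the most right now, replace it, and repeat. The sketch below is only an illustration of that idea, not the method used by any particular tool. It assumes the same simplified cost model (one unit per character, a one-character symbol per entry) and that the digits used as symbols do not occur in the text; `best_pattern` and `greedy_compress` are hypothetical names.

```python
def best_pattern(text):
    """Return (saving, pattern) for the repeated substring that saves the most characters."""
    best = (0, None)
    for length in range(2, len(text) // 2 + 1):
        for start in range(len(text) - length + 1):
            pattern = text[start:start + length]
            count = text.count(pattern)  # non-overlapping occurrences
            # Each replacement saves (length - 1) characters;
            # storing the entry costs the pattern plus its one-character symbol.
            saving = count * (length - 1) - (length + 1)
            if saving > best[0]:
                best = (saving, pattern)
    return best


def greedy_compress(text, symbols="123456789"):
    """Repeatedly take the single most helpful pattern until nothing helps."""
    dictionary = {}
    for symbol in symbols:
        saving, pattern = best_pattern(text)
        if pattern is None:
            break
        dictionary[symbol] = pattern
        text = text.replace(pattern, symbol)
    return text, dictionary


compressed, dictionary = greedy_compress("pitter_patter_pitter_patter")
print(compressed, dictionary)
```

A greedy search like this finds a good answer quickly, but nothing in it proves the answer is the best one. A different first choice could lead to a smaller total on other texts.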

Important Vocabulary

**Compress** – to decrease the number of bits used to represent a piece of information.

**Heuristic** – a problem-solving approach (algorithm) for finding a satisfactory solution when finding an optimal or exact solution is impractical or impossible.

**Lossless Compression** – a data compression algorithm that allows the original data to be perfectly reconstructed from the compressed data.
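To see what “perfectly reconstructed” means in practice, here is a minimal sketch using `zlib` from Python's standard library. The sample text is made up; any bytes would work.

```python
import zlib

# Repetitive sample text, encoded to bytes because zlib works on bytes.
original = ("pitter_patter_" * 20).encode("utf-8")

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

The compressed version is smaller, and decompressing it gives back exactly the same bytes, which is what makes the compression lossless.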

For homework, please complete questions 4-7 on Code.org