HarryP2 wrote: I managed to compress the text of one of my text adventures by 13.6%, and I still have room for a few more tokens.
You already get a reduction of 12.5% if you just store 7 bits instead of 8 bits per character, with no tokenization at all. Whatever your tool attempts as "compression", a result of 13.6% is utterly below par.
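To illustrate the 12.5% figure: packing 7-bit characters back to back turns every 8 characters into 7 bytes. The following is a minimal sketch in plain C, not code from this thread. The names `pack7` and `unpack7` are hypothetical, the input is assumed to contain only values 0..127, and nothing here is tuned for cc65 or the 6502.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack n 7-bit characters (values 0..127) into a contiguous bit stream.
   out must have room for (n * 7 + 7) / 8 bytes.
   Returns the number of bytes written. */
size_t pack7(const uint8_t *in, size_t n, uint8_t *out)
{
    uint32_t acc = 0;   /* bit accumulator, newest bits at the low end */
    int bits = 0;       /* number of pending bits in acc */
    size_t o = 0;
    size_t i;

    for (i = 0; i < n; ++i) {
        acc = (acc << 7) | (uint32_t)(in[i] & 0x7F);
        bits += 7;
        while (bits >= 8) {
            bits -= 8;
            out[o++] = (uint8_t)(acc >> bits);
        }
        acc &= (1u << bits) - 1u;   /* keep only the pending bits */
    }
    if (bits > 0) {
        /* pad the last partial byte with zero bits */
        out[o++] = (uint8_t)(acc << (8 - bits));
    }
    return o;
}

/* Unpack n_chars characters from a stream produced by pack7.
   Returns the number of input bytes consumed. */
size_t unpack7(const uint8_t *in, size_t n_chars, uint8_t *out)
{
    uint32_t acc = 0;
    int bits = 0;
    size_t i = 0;
    size_t o;

    for (o = 0; o < n_chars; ++o) {
        while (bits < 7) {
            acc = (acc << 8) | in[i++];
            bits += 8;
        }
        bits -= 7;
        out[o] = (uint8_t)((acc >> bits) & 0x7F);
    }
    return i;
}
```

With a 16-character string, `pack7` produces 14 bytes, which is exactly the 12.5% saving mentioned above. The decoder needs to know the character count, so a real program would have to store it or reserve a terminator value.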
Shorten text on a cc65 program for Vic20?
- Mike
- Herr VC
- Posts: 4856
- Joined: Wed Dec 01, 2004 1:57 pm
- Location: Munich, Germany
- Occupation: electrical engineer
Re: Exchange of compression ideas
Re: Exchange of compression ideas
Mike, you're right. 13.6% is nothing. Huffman coding is said to produce a much better compression ratio. I am planning the necessary code now. My technique, as it stands, is designed to be cheap and easy to implement, so that is at least something. Try one of the techniques mentioned here!
Re: Exchange of compression ideas
The 13.6% I got was from using tokenization to manually compress repeated strings. If I write a tool to do the compression, I should get much better results.
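For readers unfamiliar with the idea, tokenization of repeated strings can be sketched as follows. This is only an illustration under assumptions, not HarryP2's actual code: bytes from 0x80 upward are treated as indices into a table of common strings, all other bytes are copied through unchanged, and the table entries and the name `expand_tokens` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical token table: byte 0x80 expands to entry 0, 0x81 to entry 1, ... */
static const char *const token_table[] = {
    "the ", "You ", "room"
};
#define TOKEN_COUNT (sizeof token_table / sizeof token_table[0])

/* Expand tokenized text into out (at most out_size - 1 characters plus
   a terminating NUL). Returns the number of characters written. */
size_t expand_tokens(const unsigned char *in, size_t n,
                     char *out, size_t out_size)
{
    size_t o = 0;
    size_t i;
    const char *s;

    for (i = 0; i < n; ++i) {
        unsigned char c = in[i];
        if (c >= 0x80 && (size_t)(c - 0x80) < TOKEN_COUNT) {
            /* token byte: copy the table entry */
            for (s = token_table[c - 0x80]; *s != '\0'; ++s) {
                if (o + 1 < out_size) {
                    out[o++] = *s;
                }
            }
        } else if (o + 1 < out_size) {
            /* literal byte: copy as is */
            out[o++] = (char)c;
        }
    }
    if (out_size > 0) {
        out[o] = '\0';
    }
    return o;
}
```

In this sketch the 11 input bytes `0x81 "are in " 0x80 0x82 "."` expand to the 20-character string "You are in the room.". The saving depends entirely on how often the table entries occur in the text, and the table itself also takes up memory.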
- Mike
Re: Exchange of compression ideas
HarryP2 wrote: Try one of the techniques mentioned here!
It is not my business to do your homework.
I have implemented several working compression algorithms over time, for text, for graphics, etc., with varying degrees of algorithmic complexity. I already mentioned one of those algorithms, which compresses most English texts by 30% with a static 5-bit alphabet. Simple, but completely sufficient for the task at hand.
Quote: Right now, I'm especially looking for something which requires little work.
You should probably ask yourself "Why bother?" when the uncompressed text body plus program code still fits into available memory.
Re: Exchange of compression ideas
You're right, and I'm sorry. About the 5-bits-per-field compression scheme: I can't use it, as I need room for lower-case and numbers as well. Again, I am planning to create a cc65 text compressor, and I have a plan in mind. And about the homework comment: I believe my suggestions are good, and I ask you to try them out. I am working on them, and I am getting exceptionally good results right now, but I suspect a bug. I want to see others use my ideas. Again, try them out!
- Mike
Re: Exchange of compression ideas
HarryP2 wrote: About the 5-bits-per-field compression scheme: I can't use it, as I need room for lower-case and numbers as well.
Earlier in this thread, I also mentioned the feature of the algorithm being "binary transparent", which you probably failed to notice.
Said algorithm is capable of handling any type of data. With text data, where many characters are likely to fall within the 5-bit alphabet (lower-case letters, space and punctuation), those characters are simply encoded more efficiently.
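One way a static 5-bit alphabet can stay binary transparent is to reserve one of the 32 codes as an escape. The sketch below is an assumption about how such a scheme could look, not Mike's actual algorithm, which is not shown in this thread. The alphabet contents, the escape code 31 and the names `encode5` and `decode5` are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 31-character alphabet; code 31 is reserved as an escape. */
static const char alphabet[] = " abcdefghijklmnopqrstuvwxyz.,'!";
#define ESCAPE 31u

typedef struct { uint8_t *buf; size_t pos; uint32_t acc; int bits; } BitWriter;
typedef struct { const uint8_t *buf; size_t pos; uint32_t acc; int bits; } BitReader;

/* Append the low `count` bits of value to the output stream. */
static void put_bits(BitWriter *w, uint32_t value, int count)
{
    w->acc = (w->acc << count) | value;
    w->bits += count;
    while (w->bits >= 8) {
        w->bits -= 8;
        w->buf[w->pos++] = (uint8_t)(w->acc >> w->bits);
    }
}

/* Read the next `count` bits (count <= 8) from the input stream. */
static uint32_t get_bits(BitReader *r, int count)
{
    while (r->bits < count) {
        r->acc = (r->acc << 8) | r->buf[r->pos++];
        r->bits += 8;
    }
    r->bits -= count;
    return (r->acc >> r->bits) & ((1u << count) - 1u);
}

/* Encode n bytes; out needs room for (n * 13 + 7) / 8 bytes in the
   worst case. Returns the number of bytes written. */
size_t encode5(const uint8_t *in, size_t n, uint8_t *out)
{
    BitWriter w;
    const char *p;
    size_t i;

    w.buf = out; w.pos = 0; w.acc = 0; w.bits = 0;
    for (i = 0; i < n; ++i) {
        /* strchr would also match the terminating NUL, so exclude 0 */
        p = (in[i] != 0) ? strchr(alphabet, in[i]) : NULL;
        if (p != NULL) {
            put_bits(&w, (uint32_t)(p - alphabet), 5);   /* 5-bit code */
        } else {
            put_bits(&w, ESCAPE, 5);                     /* escape ... */
            put_bits(&w, in[i], 8);                      /* ... plus raw byte */
        }
    }
    if (w.bits > 0) {
        put_bits(&w, 0, 8 - w.bits);   /* pad the last byte with zeros */
    }
    return w.pos;
}

/* Decode n_bytes original bytes from a stream produced by encode5. */
void decode5(const uint8_t *in, size_t n_bytes, uint8_t *out)
{
    BitReader r;
    uint32_t code;
    size_t o;

    r.buf = in; r.pos = 0; r.acc = 0; r.bits = 0;
    for (o = 0; o < n_bytes; ++o) {
        code = get_bits(&r, 5);
        out[o] = (code == ESCAPE) ? (uint8_t)get_bits(&r, 8)
                                  : (uint8_t)alphabet[code];
    }
}
```

In this sketch, text made up only of alphabet characters shrinks to 5/8 of its size, while every other byte (upper case, digits, arbitrary binary data) costs 13 bits. Upper case and digits therefore remain usable; they are just more expensive, so the overall ratio depends on how rare they are in the text.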
Quote: I believe my suggestions are good, and I ask you to try them out.
Frankly put - no, that's entirely your job.