Dennis Ranke wrote:
>
> Hmm, I have now implemented AAN DCT, and my decoder now takes 2.9cs on a
> 233MHz StrongARM to decode a 17kb 240x160 JPEG to 32768 colours with
> ordered dithering. This speed comes quite near to another JPEG
> implementation on this computer, which I believe is quite well optimized.
> When I take only the difference in clock speed into account, it would
> take about 0.4s on GBA.
> This is of course somewhat different to your figures, therefore I am
> wondering if you just stated the speed of your DCT core or that of
> the complete decoder.
Nope, I meant the entire kaboodle. Though my file format is not a true
JPEG-compliant stream, I still feature all the essentials, but no dithering. I
keep quantisation down just sufficiently to make sure we never compromise the
destination colour range after colour space conversion.
As a point of reference regarding speed, the JPEG library that ARM sells
(circa $60,000 iirc) does about 0.55 megapixels per second on a 40MHz ARM7. On
a 200MHz StrongARM their decoder rates at 4.0 megapixels per second, which is
roughly 9.6ms for a 240x160 image and approx.
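
The arithmetic behind these figures can be sketched as follows. This is only a back-of-the-envelope check, not anyone's benchmark code: the function names are made up for illustration, and the 16.78MHz GBA clock is an assumption brought in from outside this thread. Scaling by clock speed alone also ignores memory wait states and the differences between a StrongARM and an ARM7.

```python
def frame_time_ms(width, height, megapixels_per_second):
    """Time to decode one frame at a given throughput, in milliseconds."""
    pixels = width * height
    return pixels / (megapixels_per_second * 1_000_000) * 1000.0


def scale_by_clock(seconds, from_mhz, to_mhz):
    """Naive estimate: assume run time is inversely proportional to clock speed."""
    return seconds * from_mhz / to_mhz


GBA_CLOCK_MHZ = 16.78  # assumed GBA CPU clock, not stated in the thread

# ARM's quoted 4.0 megapixels per second on a 200MHz StrongARM, one GBA screen.
print(frame_time_ms(240, 160, 4.0))                  # about 9.6 ms

# The quoted 2.9cs (0.029s) at 233MHz, scaled by clock speed only.
print(scale_by_clock(0.029, 233.0, GBA_CLOCK_MHZ))   # about 0.4 s
```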
I'm guessing my coder clocks in faster than theirs purely down to the
shortcuts I've taken given the target platform's display. If I supported the
full range of options, mine would be marginally slower than theirs right
now, but since I don't need to, I'm not bothering. Ask yourself: do you really
need all that extra fluff?
> My dct core seems to take about 0.3cs for the above picture, which means
> that there is certainly still a lot of room for optimisation in other
> areas...
Although an 8-point DCT cannot be done in fewer than 11 multiplies, it is
possible to arrange the computation so that many of the multiplies are simple
scalings of the final outputs. These multiplies can then be folded into the
multiplications by the JPEG quantization table entries. The AAN method
leaves only 5 multiplies and 29 adds to be done in the DCT itself.
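
As a rough illustration of the folding step, here is a minimal sketch that builds a combined dequantisation table once per image. It assumes the usual AAN scale factors (1 for index 0, cos(k*pi/16)*sqrt(2) for k = 1..7, applied per row and per column); the 12-bit fixed-point precision, the flat quantisation table and the function names are all made up for the example and are not taken from any decoder discussed here.

```python
import math

# AAN scale factors: 1 for k = 0, cos(k*pi/16) * sqrt(2) for k = 1..7.
AAN_SCALE = [1.0] + [math.cos(k * math.pi / 16.0) * math.sqrt(2.0)
                     for k in range(1, 8)]

FRACTION_BITS = 12  # assumed fixed-point precision for this sketch


def fold_scales(quant_table):
    """Return an 8x8 table of quant * scale[row] * scale[col] in fixed point."""
    folded = []
    for row in range(8):
        folded.append([
            int(round(quant_table[row][col] * AAN_SCALE[row] * AAN_SCALE[col]
                      * (1 << FRACTION_BITS)))
            for col in range(8)
        ])
    return folded


def dequantize(coefficients, folded):
    """One multiply per coefficient does both dequantisation and AAN scaling."""
    return [[coefficients[r][c] * folded[r][c] for c in range(8)]
            for r in range(8)]


flat_table = [[16] * 8 for _ in range(8)]   # hypothetical flat quantisation table
folded = fold_scales(flat_table)
print(folded[0][0])   # 16 * 1.0 * 1.0 * 4096 = 65536
```

The table is computed once, so the per-block cost is the same single multiply per coefficient that plain dequantisation already needs. The extra fraction bits still have to be shifted out again at the end of the transform.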
A big boost can be achieved by short-circuiting the DCT for any column that
has all its AC terms at zero. This occurs a lot more frequently than might be
expected, due to quantisation. If all the AC terms of a column are zero then
every output in that column is equal to the DC coefficient, with scale factor
as needed. You'll find that in typical images half or more of the columns can
be handled in this fashion.
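
The shortcut can be sketched like this. To keep the example small and checkable it uses a slow reference inverse DCT for the general path rather than AAN, and an orthonormal scaling in which the DC-only output is DC/sqrt(8); in a real decoder the scale factor depends on how the transform is normalised. All names here are invented for the illustration.

```python
import math


def idct_1d(coeffs):
    """Reference (slow) orthonormal 8-point inverse DCT."""
    out = []
    for n in range(8):
        acc = coeffs[0] / math.sqrt(8.0)
        for k in range(1, 8):
            acc += 0.5 * coeffs[k] * math.cos((2 * n + 1) * k * math.pi / 16.0)
        out.append(acc)
    return out


def idct_column(coeffs):
    """Return (samples, took_shortcut) for one column of 8 coefficients."""
    if not any(coeffs[1:]):
        # All AC terms are zero: every output is just the scaled DC term.
        return [coeffs[0] / math.sqrt(8.0)] * 8, True
    return idct_1d(coeffs), False


def idct_columns(block):
    """Column pass over block[row][col]; also counts the shortcut columns."""
    out = [[0.0] * 8 for _ in range(8)]
    shortcuts = 0
    for c in range(8):
        column = [block[r][c] for r in range(8)]
        samples, skipped = idct_column(column)
        shortcuts += skipped
        for r in range(8):
            out[r][c] = samples[r]
    return out, shortcuts
```

The test costs seven comparisons per column (fewer on hardware that can OR several coefficients together first), against a full 8-point transform when it succeeds.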
And once you do all that and are sitting back smiling, thinking how cool
this is and maybe starting to think that an MPEG player on that wee
little 16MHz ARM7 wouldn't be so impossible after all, go look up Fast DCT
'Approximations'. There you'll discover a whole new avenue of DCTs that use
*ZERO* multiplies, just shifts and adds, and go very, very, very fast :)
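
To give a flavour of the idea, here is a minimal sketch of the basic building block: replacing a multiply by a fixed constant with a handful of shifts and adds. It is not any particular published DCT approximation, and the choice of constant (sqrt(2), which appears in the AAN butterflies) and of shift terms is just an assumption for the example.

```python
def mul_sqrt2_shift_add(x):
    """Approximate x * sqrt(2) for an integer x using only shifts and adds.

    1 + 1/4 + 1/8 + 1/32 + 1/128 = 1.4140625, versus sqrt(2) = 1.41421356...
    """
    return x + (x >> 2) + (x >> 3) + (x >> 5) + (x >> 7)


print(mul_sqrt2_shift_add(1024))   # 1024 + 256 + 128 + 32 + 8 = 1448
```

Dropping shift terms trades accuracy for speed, which is presumably why these transforms are called approximations.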
Hi, I remember reading an e-mail about image compression on the gba. I will share the alternatives I found regarding image compression. I just discovered the...
Vince VE
vince_0x0f@...
Oct 2, 2000 10:26 pm
On Mon, 2 Oct 2000, Vince VE wrote: <snip> ... Well, if you're looking for a technical challenge, at http://www.ffd2.com/fridge/chacking/c=hacking19.txt you'll...
Anders M Montonen
ammonton@...
Oct 3, 2000 12:07 pm
In message <Pine.OSF.4.20.0010031456570.2147-100000@...> ... Hm, since this is a gba mailing list, i assume you are talking about the gameboy...
Dennis Ranke
exoticorn@...
Oct 3, 2000 1:02 pm
... Oops. I had a serious eye-brain-disconnection, and thought the mail was from the gb-dev list. Of course the GBA can use JPEGs! Sorry everyone, my bad. -a...
Anders M Montonen
ammonton@...
Oct 3, 2000 1:12 pm
... Go for the AAN DCT.. Using that I'm currently at 0.31 MegaPixels per second (about 8fps) using 8:1 compressed images. That's still in a complete C++ ...
Andy Mucho
andy@...
Oct 3, 2000 1:31 pm
In message <EHEOLOLLINEMBNHDJDFLOEDDHEAA.andy@...> ... Yes, that's what i will try to implement today. Thanks to you i now also have some...
Dennis Ranke
exoticorn@...
Oct 3, 2000 1:49 pm
In message <EHEOLOLLINEMBNHDJDFLOEDDHEAA.andy@...> ... Hmm, I have now implemented AAN DCT and my decoder now takes on a 233Mhz StrongARM...
Dennis Ranke
exoticorn@...
Oct 3, 2000 8:44 pm
... Nope, I meant the entire kaboodle. Though my file format is not a true JPEG compliant stream, I still feature all the essentials, but no dithering. I keep...
Andy Mucho
andy@...
Oct 3, 2000 10:11 pm
In message <EHEOLOLLINEMBNHDJDFLEEFKHEAA.andy@...> ... Hm, that's impressive. That means that i have still some things to do... ... Hm, in...
Dennis Ranke
exoticorn@...
Oct 4, 2000 3:45 pm
... why 'impractical for real use' ? more like 'the most decent image compression out there' ... (yes, it works like a charm on c64 ;=P) -- ... /^\ \ / ASCII...
Groepaz
groepaz@...
Oct 3, 2000 12:11 pm
Anyone know the MIPS usage of an MP audio stream? I downloaded a DOS compressor/decompressor and it was a total beast of a program. That psycho-acoustic model...
Sean Dunlevy
sean@...
Oct 3, 2000 1:36 pm
... Expect on the ARM7 architecture a full 48K 128Kbps Layer3 stream to require close to 30MHZ.. Obviously pretty useless for the GBA's underpowered CPU.. So...
Andy Mucho
andy@...
Oct 3, 2000 2:15 pm
Useful to know. So, it should be possible to squeeze a low-rate stream out of the thing. I suppose 32KHz is fine considering the DAC & op-amp isn't likely to ...
Sean Dunlevy
sean@...
Oct 3, 2000 2:20 pm
Why use any alternatives to the compression built into the hardware? I would think those would be the fastest to decompress. Doesn't the hardware support...
Joshua Meeds
dreamer@...
Oct 4, 2000 12:15 am
Hi, No you are quite correct. In fact I store my images using LZ77 compression but no Huffman so that you can decompress straight into VRAM. If you use ...