How to know if HW/SW codesign will be useful for a specific application?
I will be in my final year (Electrical and Computer Engineering )the next semester and I am searching for a graduation project in embedded systems or hardware design . My professor advised me to search for a current system and try to impro开发者_如何学Pythonve it using hardware/software codesign and he gave me an example of the "Automated License Plate Recognition system" where I can use dedicated hardware by VHDL or verilog to make the system perform better .
I have searched a bit and found some youtube videos that are showing the system working ok .
So I don't know if there is any room of improvement . How to know if certain algorithms or systems are slow and can benefit from codesign ?
How to know if certain algorithms or systems are slow and can benefit from codesign ?
In many cases, this is an architectural question that is only answered with large amounts of experience or even larger amounts of system modeling and analysis. In other cases, 5 minutes on the back of an envelop could show you a specialized co-processor adds weeks of work but no performance improvement.
An example of a hard case is any modern mobile phone processor. Take a look at the TI OMAP5430. Notice it has a least 10 processors, of varying types(the PowerVR block alone has multiple execution units) and dozens of full-custom peripherals. Anytime you wish to offload something from the 'main' CPUs, there is a potential bus bandwidth/silicon area/time-to-market cost that has to be considered.
An easy case would be something like what your professor mentioned. A DSP/GPU/FPGA will perform image processing tasks, like 2D convolution, orders of magnitude faster than a CPU. But 'housekeeping' tasks like file-management are not something one would tackle with an FPGA.
In your case, I don't think that your professor expects you to do something 'real'. I think what he's looking for is your understanding of what CPUs/GPUs/DSPs are good at, and what custom hardware is good at. You may wish to look for an interesting niche problem, such as those in bioinformatics.
I don't know what codesign is, but I did some verilog before; I think simple image (or signal) processing tasks are good candidates for such embedded systems, because many times they involve real time processing of massive loads of data (preferably SIMD operations).
Image processing tasks often look easy, because our brain does mind-bogglingly complex processing for us, but actually they are very challenging. I think this challenge is what's important, not if such a system were implemented before. I would go with implementing Hough transform (first for lines and circles, than the generalized one - it's considered a slow algorithm in image processing) and do some realtime segmentation. I'm sure it will be a challenging task as it evolves.
First thing to do when partitioning is to look at the dataflows. Draw a block diagram of where each of the "subalgorithms" fits, along with the data going in and out. Anytime you have to move large amounts of data from one domain to another, start looking to move part of the problem to the other side of the split.
For example, consider an image processing pipeline which does an edge-detect followed by a compare with threshold, then some more processing. The output of the edge-detect will be (say) 16-bit signed values, one for each pixel. The final output is a binary image (a bit set indicates where the "significant" edges are).
One (obviously naive, but it makes the point) implementation might be to do the edge detect in hardware, ship the edge image to software and then threshold it. That involves shipping a whole image of 16-bit values "across the divide".
Better, do the threshold in hardware also. Then you can shift 8 "1-bit-pixels"/byte. (Or even run length encode it).
Once you have a sensible bandwidth partition, you have to find out if the blocks that fit in each domain are a good fit for that domain, or maybe consider a different partition.
I would add that in general, HW/SW codesign is useful when it reduces cost.
There are 2 major cost factors in embedded systems:
- development cost
- production cost
The higher is your production volume, the more important is the production cost, and development cost becomes less important.
Today it is harder to develop hardware than software. That means that development cost of codesign-solution will be higher today. That means that it is useful mostly for high-volume production. However, you need FPGAs (or similar) to do codesign today, and they cost a lot.
That means that codesign is useful when cost of necessary FPGA will be lower than an existing solution for your type of problem (CPU, GPU, DSP, etc), assuming both solutions meet your other requirements. And that will be the case (mostly) for high-performance systems, because FPGAs are costly today.
So, basically you will want to codesign your system if it will be produced in high volumes and it is a high-performance device.
This is a bit simplified and might become false in a decade or so. There is an ongoing research on HW/SW synthesis from high-level specifications + FPGA prices are falling. That means that in a decade or so codesign might become useful for most of embedded systems.
Any project you end up doing, my suggestion would be to make a software version and a hardware version of the algorithm to do performance comparison. You can also do a comparison on development time etc. This will make your project a lot more scientific and helpful for everyone else, should you choose to publish anything. Blindly thinking hardware is faster than software is not a good idea, so profiling is important.
精彩评论