What are the best practices for optimizing pipeline throughput in FPGA implementations?

How, for example, does one make the best use of retiming and/or C-slow to get the most out of a given pipeline?

With retiming, some modules get better results by putting the shift registers on the inputs (forward register balancing), while other modules do better with shift registers on the output (backward register balancing).

For now I use the following method:

  • code hdl (in verilog)
  • create timing constraints for the specific module
  • synthesize, map, place & route (using ISE 13.1)
  • look at post place & route timings for the module-to-be-improved, and at the maximum number of logic levels.
  • take this number of logic levels, and make an educated guess for the number of flip-flops to insert.
  • insert flip-flops, enable register balancing, hope for the best

As it stands, this method is hit & miss. Sometimes it gets pretty good results, sometimes it's crap. So, what is a good way to improve the success ratio of such retiming?

Are there any tools that can aid in this? Also, links, papers and book recommendations would be much appreciated.


Sounds like you have the right ideas. Tool-based retiming can be a bit hit-and-miss. Sometimes adding 2 or 3 more FFs than you think you need can help, because it gives the tool more registers to distribute.
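As a minimal sketch of what offering the tool a few extra FFs can look like in Verilog: the module name `mac_pipe`, the multiply-add logic, the widths and the `EXTRA_STAGES` parameter below are all hypothetical and not taken from the question. The logic stays as one readable combinational expression, followed by a chain of plain registers that the synthesizer may pull back into the logic when register balancing is enabled. The chain has no reset or enable, on the assumption that control signals can restrict how far registers may be moved.

```verilog
// Hypothetical example: multiply-add followed by a chain of plain registers.
// With register balancing enabled, the synthesizer may move some of these
// registers backward into the multiply-add logic.
module mac_pipe #(
    parameter W            = 16, // operand width (assumed for illustration)
    parameter EXTRA_STAGES = 3   // registers offered to the tool, must be >= 1
) (
    input  wire           clk,
    input  wire [W-1:0]   a,
    input  wire [W-1:0]   b,
    input  wire [2*W-1:0] c,
    output wire [2*W-1:0] y
);
    // Unpipelined logic, written for readability rather than speed.
    wire [2*W-1:0] result = a * b + c;

    // Register chain with no reset and no enable.
    reg [2*W-1:0] pipe [0:EXTRA_STAGES-1];
    integer i;

    always @(posedge clk) begin
        pipe[0] <= result;
        for (i = 1; i < EXTRA_STAGES; i = i + 1)
            pipe[i] <= pipe[i-1];
    end

    // y lags the inputs by EXTRA_STAGES clock cycles.
    assign y = pipe[EXTRA_STAGES-1];
endmodule
```

Making the stage count a parameter means trying one more or one fewer stage is a one-line change between place & route runs. Depending on tool settings, such a chain may be inferred as a shift-register primitive instead of individual flip-flops, so it is worth checking the synthesis report to see what was actually built.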

At the other extreme, when I need to push the performance to the limit, I have to balance the pipeline by hand. This can be a right pain, having to split your nicely readable HDL code into awful explicit logic and registers - but sometimes I find it does just have to be done :( Lots of comments required and a really good testbench to make sure you haven't broken it!

Finally, there is a "half-way house". If I look at the logic path which has the most logic levels and think a bit about the code, I often find that it's only one very small piece of code (maybe just one line). This can be pulled out without "harming" the readability of the rest of the module too badly. And sometimes pulling that code into its own entity and putting the extra flipflops in that entity enables the automatic rebalancer to work better.
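A rough sketch of that half-way approach, assuming the critical line is a wide range comparison: the module `wide_compare`, its ports, and the `STAGES` parameter are hypothetical names chosen for illustration. The deep expression moves into its own small module together with the extra registers, and the parent keeps its original structure apart from one instance.

```verilog
// Hypothetical example: the one deep expression isolated in its own module,
// together with the registers that the rebalancer is allowed to move.
module wide_compare #(
    parameter W      = 32, // data width (assumed)
    parameter STAGES = 2   // registers available for rebalancing, must be >= 1
) (
    input  wire         clk,
    input  wire [W-1:0] x,
    input  wire [W-1:0] lo,
    input  wire [W-1:0] hi,
    output wire         in_range
);
    // The critical expression, still a single readable line.
    wire hit = (x >= lo) && (x <= hi);

    // Plain register chain on the output, no reset and no enable.
    reg dly [0:STAGES-1];
    integer i;

    always @(posedge clk) begin
        dly[0] <= hit;
        for (i = 1; i < STAGES; i = i + 1)
            dly[i] <= dly[i-1];
    end

    assign in_range = dly[STAGES-1];
endmodule

// Hypothetical parent: the original inline comparison is replaced by an
// instance, and everything downstream must allow for STAGES cycles of latency.
module top_example (
    input  wire        clk,
    input  wire [31:0] x,
    input  wire [31:0] lo,
    input  wire [31:0] hi,
    output wire        in_range
);
    wide_compare #(.W(32), .STAGES(2)) u_cmp (
        .clk(clk), .x(x), .lo(lo), .hi(hi), .in_range(in_range)
    );
endmodule
```

The added latency is the part that is easy to get wrong: any signal that travels alongside `in_range` in the parent needs the same number of delay stages, which is where a good testbench earns its keep.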

Good luck!


Not a tool, but you may appreciate my blog entry on the Art of High Performance FPGA Design: http://www.fpgacpu.org/log/aug02.html#art

Happy hacking!
