Developing application software for Xilinx AXI DMA

  • Published on 29 Feb 2020
  • #XilinxAXIDMA, #CacheCoherency
    This video demonstrates application software development for the Xilinx AXI DMA controller and discusses the issue of cache coherency.
    Source code
    github.com/vipinkmenon/xilinx...

Comments • 90

  • @connectme2karthik 3 years ago +5

    Thanks for this wonderful tutorial, so useful for learning the DMA data flow. I have never seen such a practical example anywhere. Good work, and I am expecting more like this.

  • @ehsanjokar1623 2 years ago

    Thanks for sharing this great tutorial. Without any doubt, this series is one of the best regarding DMA and DDR. It would be great if you could also provide a video about Ethernet.

  • @kavinduvindikasomadasa352 3 years ago +1

    It's a great tutorial. Please keep up the good work. Excellent job!

  • @alexisvanbaelen9717 3 years ago

    Big thanks for the very instructive videos. I appreciate them a lot!

  • @hotshot365 2 years ago

    Wow this is exactly what I needed. Thank you so much

  • @GiuseppeFerraro-z7giuseppe7z 1 year ago

    Wonderful tutorial, congratulations on the explanation.

  • @alexsagi3937 5 months ago

    Excellent work !🤟

  • @mdesm2005 2 years ago

    Excellent video. Thanks

  • @lorazpam6277 2 years ago

    Thanks for your great video bro

  • @reubengoh8056 4 months ago

    Thank you sir for this tutorial, I have learnt a lot from you :)

  • @tomyproconsul 1 year ago +1

    Around 1:09:44, where you mention that the first data word is not received correctly, I think the cause is that the a and b arrays are not allocated on cache-line boundaries. I think this problem would go away if you used the memalign function to allocate memory for the arrays.
    And of course thank you for this comprehensive tutorial about the DMA; it is a very rare and valuable resource. Your explanations helped me immeasurably!

    • @burgerking220 11 months ago

      This is why
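
The alignment suggestion in this thread can be sketched in portable C. This is only a sketch under stated assumptions: the 32-byte line size is the Cortex-A9 figure and has to be changed for other cores, `alloc_dma_buffer` is a hypothetical helper name, and C11 `aligned_alloc` stands in for the `memalign` call mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed cache-line size: 32 bytes (Cortex-A9 on Zynq-7000).
 * The Cortex-A53 on Zynq UltraScale+ uses 64-byte lines. */
#define CACHE_LINE_BYTES 32u

/* Round a byte count up to a whole number of cache lines. */
static size_t round_up_to_cache_line(size_t nbytes)
{
    return (nbytes + CACHE_LINE_BYTES - 1u) & ~((size_t)CACHE_LINE_BYTES - 1u);
}

/* Hypothetical helper: allocate nwords 32-bit words starting on a
 * cache-line boundary and padded to whole lines, so that flushing or
 * invalidating the buffer never touches a neighbouring variable.
 * Returns NULL on failure; release with free(). */
static uint32_t *alloc_dma_buffer(size_t nwords)
{
    assert(nwords > 0);
    return aligned_alloc(CACHE_LINE_BYTES,
                         round_up_to_cache_line(nwords * sizeof(uint32_t)));
}
```

Padding the size matters as much as aligning the start: an 8-word buffer fills exactly one 32-byte line, but a 9-word buffer would otherwise share its second line with whatever the allocator places next.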

  • @petersvideofile 3 years ago

    Awesome video! It really helped me understand how to set up the debugger :) Two things I noticed that I wanted to mention: 1) I think you need to invalidate the cache for the "b" variable after the DMA completes. I think this is why the first element in the "b" array was invalid the first time you ran the program but not the second time, as you didn't initialize "b" with 0 values. I'm not sure which command is required to invalidate the cache, though; perhaps it's "Xil_DCacheInvalidateRange". 2) I'm not so familiar with your HSL design, but I think it may not be sufficient to just delay the TREADY

    • @TheVipinkmenon 3 years ago

      Yes. You can invalidate using Xil_DCacheInvalidateRange. Since I uploaded this a long time ago, I have forgotten where this TREADY/TVALID issue comes up.

    • @mdesm2005 2 years ago

      @TheVipinkmenon When I place b on the stack (automatic variable), only the last two of 8 locations were printed correctly (starting from a power-up). In the second run (no power cycle), all values printed correctly. When I declared b with the static keyword (static storage rather than the stack), b was never updated, even when I used Xil_DCacheInvalidateRange((UINTPTR)b, sizeof(b));. When I made b just a pointer to somewhere in memory (based on a Xilinx xaxidma_example) and used Xil_DCacheInvalidateRange((UINTPTR)b, 32);, I finally got the expected result (even from power-up, every time). It seems important to always start from power-up to avoid benefiting, somehow, from the results of a previous run. BTW, I used an AXI FIFO instead of an inverter. Did you cover caches somewhere?
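
The ordering of cache maintenance around a loopback transfer, as discussed in this thread and in the replies further down, can be sketched without any board-specific code. Everything named here is hypothetical (`dma_step`, `dma_step_fn`, `dma_loopback`, `record_step`); the comments name the Xilinx calls each step would probably map to on a standalone BSP, and arming the receive channel before the transmit channel is an assumption, not something the thread confirms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Steps of one loopback transfer. */
enum dma_step {
    FLUSH_TX,       /* e.g. Xil_DCacheFlushRange on the source buffer          */
    INVALIDATE_RX,  /* e.g. Xil_DCacheInvalidateRange on the destination       */
    START_RX,       /* e.g. XAxiDma_SimpleTransfer(..., XAXIDMA_DEVICE_TO_DMA) */
    START_TX,       /* e.g. XAxiDma_SimpleTransfer(..., XAXIDMA_DMA_TO_DEVICE) */
    WAIT_DONE       /* e.g. poll XAxiDma_Busy for both directions              */
};

/* Hypothetical platform hook: performs one step, returns 0 on success. */
typedef int (*dma_step_fn)(void *ctx, enum dma_step step, void *addr, size_t nbytes);

/* Run the steps in an order that keeps cache and DDR consistent:
 * flush what the CPU wrote, invalidate the destination before and
 * after the DMA writes it. */
static int dma_loopback(dma_step_fn do_step, void *ctx,
                        uint32_t *tx, uint32_t *rx, size_t nbytes)
{
    static const enum dma_step order[] = {
        FLUSH_TX, INVALIDATE_RX, START_RX, START_TX, WAIT_DONE, INVALIDATE_RX
    };
    assert(do_step != NULL && tx != NULL && rx != NULL);
    for (size_t i = 0; i < sizeof order / sizeof order[0]; i++) {
        void *addr = (order[i] == FLUSH_TX || order[i] == START_TX)
                         ? (void *)tx : (void *)rx;
        if (do_step(ctx, order[i], addr, nbytes) != 0)
            return -1;
    }
    return 0;
}

/* Host-side mock that only records which steps ran. */
struct step_log {
    enum dma_step steps[8];
    size_t count;
};

static int record_step(void *ctx, enum dma_step step, void *addr, size_t nbytes)
{
    struct step_log *log = ctx;
    (void)addr;
    (void)nbytes;
    if (log->count >= sizeof log->steps / sizeof log->steps[0])
        return -1;
    log->steps[log->count++] = step;
    return 0;
}
```

On hardware, `record_step` would be replaced by a function that performs the real call for each step; the mock exists only so the ordering can be checked on a PC.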

  • @yadukrishnans5268 3 years ago +1

    Thanks for this tutorial. I didn't have any idea about DMA, and the previous video and this one really helped me a lot with the DMA interface. You really have a good skill for these tutorial videos. Usually I never watch a 1-hour tutorial video (I'm a bit lazy), but you made me sit and listen to this one completely. Thanks a lot.

    • @TheVipinkmenon 3 years ago

      Thank you for the kind words

    • @yadukrishnans5268 3 years ago

      @TheVipinkmenon I have a doubt. Is this code running on PetaLinux or standalone?

    • @TheVipinkmenon 3 years ago

      standalone

    • @yadukrishnans5268 3 years ago

      If I want to make the code work on PetaLinux, what changes am I supposed to make?

    • @TheVipinkmenon 3 years ago +1

      Check this lauri.xn--vsandi-pxa.com/hdl/zynq/xilinx-dma.html.

  • @benguan8634 1 year ago

    best video!

  • @consciousart1 2 years ago

    Extremely beautiful and helpful lesson, thank you for your efforts. If you ever visit Turkey (Ankara or İstanbul), please add a comment here and I will contact you for a lunch or dinner. Thank you very much. The explanation of the Xil_DCacheFlushRange function, especially, will be extremely helpful for us in developing bug-free professional applications.

  • @madhurjuneja6436 3 years ago

    Hello Vipin, thank you for the detailed video.
    Just a small query: with the simple data transfer we are using, is there any function to do customized addressing while writing to DDR?
    I mean, instead of writing sequentially to consecutive locations, can we provide an address offset between consecutive data points?

  • @johnbagshaw4704 3 years ago

    Thanks for the video, Vipin. Please, how do I read binary files from the PS to the DMA and from the DMA to a customized IP on the PL for a Zynq board?

  • @radhakrishnaganti7125 3 years ago

    Hi, can you please post the link to the previous video, in which you generate the Verilog application and the connectivity?

  •  8 months ago

    I am not able to send more than 8 elements of the array at once. Do I need to change the settings on my DMA in Vivado?

  • @prasannan2084 3 years ago

    What changes are needed to read and write a camera stream with DMA?

  • @mohamednaeim2838 3 years ago

    Thank you for this tutorial, this is really helpful.
    However, I have a strange problem. In the Vivado hardware manager and ILA everything works perfectly: I can read the inputs correctly and also see the correct outputs. But in the Vitis serial terminal, the status before the data transfer is 0 and the status after the data transfer is also 0. Moreover, when I print with xil_printf("%x\n",b[i]); I only see zeros in the Vitis serial terminal. I don't know the reason for that, but on the Vivado hardware side everything works perfectly.
    - Vivado 19.2
    - Vitis 19.2

  • @mdesm2005 2 years ago

    At 21:36, sizeof(a) would have worked without being divided by sizeof(U32). Wikipedia: "When sizeof is applied to the name of an array, the result is the number of bytes required to store the entire array". It's interesting that the comment for the function says "* @param Length is the length of the transfer" without mentioning the units, which apparently are bytes.
    At 23:04, maybe casting the array name 'a' with (UINTPTR), as done in the Xilinx example, would be better. Notice that "PTR" stands for pointer. Otherwise, I would expect a cast like (u32 *), not (u32), to cast a pointer.
    At 57:45, using a ZC702, I get IDLE=0, HALTED=1 after init; IDLE=0, HALTED=0 during the transfer; and IDLE=1, HALTED=0 after the transfer.
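
The two points about sizeof and the address cast can be checked on any C compiler. This is a minimal sketch: the array name `a` and its length of 8 follow the comment, the three helper functions are hypothetical, and the standard `uintptr_t` stands in for the Xilinx `UINTPTR` type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_WORDS 8

static uint32_t a[NUM_WORDS];

/* sizeof applied to an array name yields the size of the whole array
 * in bytes, which is the unit the comment above identifies for the
 * transfer length. */
static size_t transfer_length_bytes(void)
{
    return sizeof(a);
}

/* Element count, for loops that index the array. */
static size_t element_count(void)
{
    return sizeof(a) / sizeof(a[0]);
}

/* Convert the buffer pointer to an integer type that is guaranteed to
 * be wide enough for an address, on 32-bit and 64-bit targets alike. */
static uintptr_t buffer_address(void)
{
    return (uintptr_t)a;
}
```

A cast to a fixed 32-bit type would truncate the address on a 64-bit core such as the Cortex-A53, which is the practical reason to prefer the pointer-sized integer type.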

  • @user-ro8jz8eu7e 2 months ago

    Thank you for this tutorial, this was great! One question: if I have a FIFO instead of the inverter, does the software code work the same? Thank you!

  • @pavankumarg4895 1 year ago

    In the waveforms at 1:09:12, I can see that the slot1 TVALID is low for 2 cycles. What is the reason for this? Can it be avoided? In this application TVALID going low is not an issue, but I have another application where I want 100% throughput across the AXI streaming interface, and there TVALID also goes low in between. Is there anything I can do here?

  • @vw8611 2 years ago

    Hi Vipin, thank you very much for this awesome tutorial. I am trying to send an array of length 100,000, but the DMA initialization fails. Could you please give me some advice on how to go about sending large amounts of data through the streaming interface?

  • @brendamarianahernandez1761 2 years ago

    Hello! First, thank you very much for the whole series of videos. They were really helpful.
    I am trying to connect a VHDL program as a module directly to the custom IP input and then to the DMA (like you explained in the videos, but only using M2S in the custom IP, because the data for the other interface comes from the VHDL module I mentioned). The thing is that I don't know how and where to read the data written by the DMA using the C program in my Vitis project. Can you help me with this issue? Thank you very much again.

  • @van-dungpham3699 3 years ago +3

    Thank you, professor, for your wonderful tutorial. I understand that each Xilinx IP has a corresponding driver and example in C/C++, but where can we find the hardware design example?

    • @TheVipinkmenon 3 years ago +2

      An example may not be available for all IPs, but for many IPs, if you right-click the IP in the block design, there will be an option to open an example design.

    • @van-dungpham3699 3 years ago +1

      @TheVipinkmenon
      Dear Professor, I saw that in this video you export the hardware without including the bitstream.
      If I select the option that includes the bitstream, which run configuration options in SDK do I have to use to debug the application on the ARM core in Zynq UltraScale using the ILA debug core?
      1. Reset entire system
      2. Reset APU
      3. Reset RPU
      4. Enable RPU
      5. Program FPGA
      6. Run psu_init
      7. PL Powerup
      Thank you very much for your help.

  • @prathoshshastry9397 4 years ago +2

    Hello Vipin, thanks for the tutorials on AXI peripherals; they are very helpful.
    I have written an algorithm in VHDL, and now I want to build a custom IP and access it from SDK. I am new to FPGA design, I am confused, and I cannot find the right documentation on what to consider when building a custom IP. Can you suggest any video, documentation, or anything else that may help me?

    • @TheVipinkmenon 4 years ago +1

      Hardware part: th-cam.com/video/I0eu_Y3pMmM/w-d-xo.html Software part: th-cam.com/video/U-75MjbZyJE/w-d-xo.html

  • @8281samrat 8 months ago

    Why do you receive data like 1, 2, 3, 4 in the processor through the UART?
    We should get values like FFFFFFFF, FFFFFFFE, FFFFFFFD, ... as the inversion operation is performed and the data is looped back to the DMA.

  • @amud234 4 years ago +2

    Hello sir, I am using a Zybo board for the project. I did each and every step mentioned in the video, but when I try to print anything in SDK, nothing shows up on the COM port. Could you help me with this please?

    • @TheVipinkmenon 4 years ago

      Did you finish the hardware development? Zybo is not listed by default among the boards Vivado supports. Did you add the board configuration to Vivado, or did you do it with the ZedBoard configuration?

    • @amud234 4 years ago

      @TheVipinkmenon I downloaded the relevant board packages to enable the board and then did it. As a matter of fact, even the hello world application does not print anything.

    • @TheVipinkmenon 4 years ago +1

      I see. Double-check that the correct COM port is chosen for the serial interface and that, in the run configuration under the Application tab, the processor is checked and the ELF file is present.

  • @kavinduvindikasomadasa352 3 years ago

    Hello sir, I'm trying to transfer data from the APU to the FPGA using an AXI DMA in register mode. Once I do the first transfer using the simpleTransfer() function, the Halted bit stays at 0 and the Idle bit becomes 1. The first transfer completes successfully, and I can see the TLAST signal of the M_AXIS_MM2S interface asserted in the ILA waveforms. But when the second transfer is executed using simpleTransfer(), no data is sent to the FPGA. I tried to trigger the ILA on the data of the second transfer, but none of it arrives. When I check with the XSCT debugger, though, the register space of the AXI DMA shows the destination address and transfer length of the second transfer. I believe this results from the Halted bit being 0 after the first transfer, so my question is: how can I use the AXI DMA in register mode to transfer the second data block? Do I need to stop the AXI DMA by writing to the RS (run/stop) field of the control register? It would be a great help if you could answer. Thank you.

    • @TheVipinkmenon 3 years ago

      For your first transfer, all status register bits are behaving as expected (Idle=1, Halted=0, which means RS is probably 1). In the second case I didn't fully understand the value of Halted. If it is 1, some error happened during the transfer, the channel was halted, and the RS bit was automatically cleared. Try checking the MM2S_STATUS register to find out whether any error is happening. If you cannot find the reason or correct the error, one way is to apply a soft reset using the control register, then set the RS bit, and then call the simple data transfer function again.
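
The status check described in this reply can be sketched as a small decoder. The bit positions below are the ones listed for MM2S_DMASR and S2MM_DMASR in the AXI DMA product guide (PG021) and should be verified against the IP version in use; the struct and function names are hypothetical, and reading the register itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Status register bit positions as listed in the AXI DMA product guide
 * (PG021); verify them against the IP version in your design. */
#define DMASR_HALTED      (1u << 0)  /* channel is stopped     */
#define DMASR_IDLE        (1u << 1)  /* last transfer finished */
#define DMASR_DMA_INT_ERR (1u << 4)  /* internal error         */
#define DMASR_DMA_SLV_ERR (1u << 5)  /* slave error            */
#define DMASR_DMA_DEC_ERR (1u << 6)  /* address decode error   */
#define DMASR_ERROR_MASK \
    (DMASR_DMA_INT_ERR | DMASR_DMA_SLV_ERR | DMASR_DMA_DEC_ERR)

struct dma_status {
    bool halted;
    bool idle;
    bool error;
};

/* Split a raw status word into the three flags discussed above. */
static struct dma_status decode_dmasr(uint32_t sr)
{
    struct dma_status st;
    st.halted = (sr & DMASR_HALTED) != 0;
    st.idle   = (sr & DMASR_IDLE) != 0;
    st.error  = (sr & DMASR_ERROR_MASK) != 0;
    return st;
}

/* Following the advice above: a halted channel that also flags an
 * error needs a soft reset before the next transfer is attempted. */
static bool needs_soft_reset(uint32_t sr)
{
    struct dma_status st = decode_dmasr(sr);
    return st.halted && st.error;
}
```

The three states another commenter reports on a ZC702 (halted after init, neither flag during the transfer, idle afterwards) correspond to status words with bit 0 set, no low bits set, and bit 1 set.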

  • @Andrew-eg2pc 3 years ago

    Does it take a long time to generate the bitstream? It seems that my Vivado has been stuck in the synthesis stage for 5 hours now without making any progress.

    • @TheVipinkmenon 3 years ago

      It depends on the system configuration, but for this design 5 hours is too much. Please check how much memory Vivado is using. You are supposed to have at least 8 GB of RAM for Vivado. Also try increasing the virtual memory size.

  • @eceupskill5400 1 year ago

    Hi, you said that any variable declared in the code is automatically stored in DDR and that this depends on some settings. Which settings are those, and how can I know that the variable will be stored in DDR? Thanks

    • @betechiZ 10 months ago

      Go to the linker file. There is a setting there under which the code is stored in DDR by default.

  • @alexandrosiii5676 3 years ago

    Can I ask if the time to complete the above math can be viewed in ILA?

    • @TheVipinkmenon 3 years ago

      Yes, if you really want to find it that way. Or you can use the timers/counters inside the PS to do it. Look at the comment section of this video: th-cam.com/video/HXUB2Lymguc/w-d-xo.html I have shown how to use the timers.

  • @kavinduvindikasomadasa352 3 years ago

    I checked the ILA, and the transfer happens using the AXI DMA simple transfer. But when I use the b array to receive the data from the DMA into memory, all the values show as 0 when I print them. Can you please tell me a reason for this issue?

    • @TheVipinkmenon 3 years ago

      It should be due to the cache. Check Xil_DCacheInvalidateRange.

    • @kavinduvindikasomadasa352 3 years ago +1

      @TheVipinkmenon Thanks for answering. It works fine now.

  • @jrtrojancoc5336 3 years ago

    Can you explain interrupt handling as well?

    • @TheVipinkmenon 3 years ago

      Please check videos 27 and 28. Also, the image processing series in this same playlist uses interrupts extensively.

  • @TheDogWithTheMan 2 years ago

    Is the caching issue also true for the other direction? I.e., with DMA from a peripheral to DDR, does the DMA data get cached before going into the DDR? So in this scenario, do I still need a cache flush to get the correct data from the DDR?

    • @TheVipinkmenon 2 years ago +1

      If it is between a peripheral and DDR, there won't be any caching issue. The cache sits between the processor and DDR. So if data is first written by the processor to DDR and then DMAed to a peripheral, you need to flush the cache before invoking the DMA. Similarly, before the processor reads data from DDR that was DMAed from a peripheral, the cache should be invalidated.

    • @TheDogWithTheMan 2 years ago

      @TheVipinkmenon "Similarly, before the processor reads data from DDR that was DMAed from a peripheral, the cache should be invalidated." I don't understand why a flush is necessary in this case if the PS did not read this memory, hence nothing is cached (or if we explicitly flush the dcache prior to the DMA transfer). We perform a DMA transfer from the peripheral to DDR, then the PS performs a read from DDR. Why is a flush necessary here?

    • @TheDogWithTheMan 2 years ago

      @TheVipinkmenon I'm wondering if the answer to my question is CPU prefetching, which would explain why I would need to invalidate the cache even for the other direction.

    • @TheVipinkmenon 2 years ago

      Assume you are sending a video frame by frame to a buffer and the processor is reading and processing it. There is a chance that the video buffer is cached, so it is better to invalidate the cache before reading the next frame from the DDR buffer.
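
The two directions discussed in this thread can be illustrated with a toy model that runs on a PC. It is not how the ARM caches are implemented: it assumes one cache entry per word, whole-cache flush and invalidate operations, and no prefetching, and every name in it is made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WORDS 4

/* Toy model: DDR plus a write-back cache with one entry per word. */
struct toy_system {
    uint32_t ddr[WORDS];
    uint32_t cache[WORDS];
    bool     valid[WORDS];  /* entry holds a copy of the word     */
    bool     dirty[WORDS];  /* copy is newer than the word in DDR */
};

/* CPU accesses go through the cache. */
static uint32_t cpu_read(struct toy_system *s, int i)
{
    assert(i >= 0 && i < WORDS);
    if (!s->valid[i]) {             /* miss: fill the entry from DDR */
        s->cache[i] = s->ddr[i];
        s->valid[i] = true;
        s->dirty[i] = false;
    }
    return s->cache[i];
}

static void cpu_write(struct toy_system *s, int i, uint32_t value)
{
    assert(i >= 0 && i < WORDS);
    s->cache[i] = value;
    s->valid[i] = true;
    s->dirty[i] = true;
}

/* DMA accesses go straight to DDR and never see the cache. */
static uint32_t dma_read(const struct toy_system *s, int i)
{
    return s->ddr[i];
}

static void dma_write(struct toy_system *s, int i, uint32_t value)
{
    s->ddr[i] = value;
}

/* Flush: write dirty entries back to DDR, then drop them. */
static void cache_flush(struct toy_system *s)
{
    for (int i = 0; i < WORDS; i++) {
        if (s->valid[i] && s->dirty[i])
            s->ddr[i] = s->cache[i];
        s->valid[i] = false;
        s->dirty[i] = false;
    }
}

/* Invalidate: drop entries without writing them back. */
static void cache_invalidate(struct toy_system *s)
{
    for (int i = 0; i < WORDS; i++) {
        s->valid[i] = false;
        s->dirty[i] = false;
    }
}
```

In this model, a `cpu_write` followed by `dma_read` returns the old DDR contents until `cache_flush` runs, and a `cpu_read` that cached a word before `dma_write` keeps returning the old value until `cache_invalidate` runs. The second case is the frame-buffer situation described above.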

  • @MukeshGhosh123 2 years ago

    Hello Vipin,
    Can we transfer float values using the AXI DMA simple transfer? I am confused about that, because most of the AXI DMA examples I saw online transferred integer or u32 values. If you have any idea about this, please give some brief feedback. Thanks in advance.
    Kind regards
    Mukesh Ghosh

    • @TheVipinkmenon 2 years ago +1

      AXI doesn't care about the data type. It simply transfers the bits stored in memory. So if you store a floating-point array in memory and call the DMA transfer with its start address and length, it will transfer that much data. The only thing to remember is that a float is stored in memory in IEEE format, so it is up to your hardware to interpret the bit pattern correctly. If you are transferring from the device to memory, it is the same: the device should provide floating-point data in IEEE format, which gets stored in the given buffer. Then you can use a pointer cast to float type in software to access it from memory.

    • @MukeshGhosh123 2 years ago

      @TheVipinkmenon
      Hello Vipin,
      Thanks for the reply. Yes, I understood it and my problem is solved. But I am also trying to load two AXI input streams into my IP block for subtraction. When I transfer the two streams to my IP block using two different AXI DMA blocks, I get garbage values as output, which is unexpected.
      1. Is it possible to transfer two AXI input streams concurrently?
      2. Do you know of any example that transfers two AXI input streams for an algebraic operation?
      If you know, please let me know. Thanks in advance.
      Kind regards
      Mukesh Ghosh
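
On the point in this thread that the DMA moves raw bits regardless of data type, a short sketch shows the 32-bit word that hardware would see for a given float. It assumes `float` is a 32-bit IEEE 754 single (the static assertion checks the size only), and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 single-precision value. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* The 32-bit word that sits in memory for a given float. */
static uint32_t float_to_word(float f)
{
    uint32_t w;
    memcpy(&w, &f, sizeof w);   /* copy the bits, no numeric conversion */
    return w;
}

/* Interpret a received 32-bit word as a float again. */
static float word_to_float(uint32_t w)
{
    float f;
    memcpy(&f, &w, sizeof f);
    return f;
}
```

For example, 1.0f is stored as the word 0x3F800000, so an inverter like the one in the video would return 0xC07FFFFF for it rather than a numerically meaningful result. Copying with `memcpy` is used here instead of a pointer cast because it avoids aliasing problems in portable C.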

  • @alexandrosiii5676 3 years ago

    Is this part the same as transferring data from PS to PL? (I'm a beginner)

    • @TheVipinkmenon 3 years ago

      It is from DDR to PL through the AXI HP port

    • @alexandrosiii5676 3 years ago

      @TheVipinkmenon So if I want to transfer data from PS to PL, can I do this? First I send the data to the DDR, then put it through the processing chip and return it to RAM, then do what you are doing, which is to move it from DDR to PL, and now I can show the data by driving an LED?

  • @alexandrosiii5676 3 years ago

    Can I ask: does System ILA work with AXI4-Lite?

    • @TheVipinkmenon 3 years ago

      Of course. If you are doing a block design, just right-click the AXI-Lite interface and choose debug. It will automatically instantiate the ILA controller and connect all required signals.

  •  4 years ago

    Great video! Keep it up. Would you like to be TH-cam friends? :)

  • @najrul095 3 years ago

    On a Zynq UltraScale+ ZCU102,
    I tried exactly the same design, and even tried copy-pasting the code from your GitHub link, but when printing the contents of 'b', it gives wrong values.
    This is what it is printing when I run your code.
    DMA initialization success..
    Status before data transfer 1
    DMA transfer success..
    0
    0
    0
    0
    11110
    0
    3688
    0
    And if I add sleep(1) or even sleep(50), it prints
    0
    0
    0
    0
    12110
    0
    37A0
    0
    But the output shown in the ILA is correct; I mean the inverter's output comes out as 255 - input in the ILA.
    So why is it not printed properly in the terminal?

    • @TheVipinkmenon 3 years ago +1

      Please try cache invalidate before printing the values after DMA

    • @najrul095 3 years ago

      @TheVipinkmenon
      Thanks a lot, sir, for replying.
      I tried it this way too; it prints some different values
      0
      0
      0
      0
      11110
      0
      2FA0
      0
      But in all cases, the ILA shows the correct values.

    • @TheVipinkmenon 3 years ago

      Please try two invalidates, one before the DMA and one after the DMA
      (www.xilinx.com/support/answers/64839.html)

    • @mdesm2005 2 years ago

      @TheVipinkmenon For some reason, this link includes a bracket on the right side that needs to be removed

    • @burgerking220 1 year ago

      Align it

  • @techtronic2283 3 years ago

    First I program the FPGA using Vivado, then I run the application program from SDK, but the printed messages don't show up in the SDK terminal. Can you tell me a way to find the issue here?

    • @walimunisomadasa4032 3 years ago

      I got the same issue and was stuck there for hours. Can you please explain that?

    • @techtronic2283 3 years ago

      @walimunisomadasa4032 I found that in the XAxiDma_LookupConfigBaseAddr function, the function argument is a UINTPTR (but in the above video, it's u32). Could that change be the issue? I'm using Vivado version 18.3 here.

    • @TheVipinkmenon 3 years ago

      Is a plain hello world print working?

    • @kavinduvindikasomadasa352 3 years ago

      @walimunisomadasa4032 In the Vivado 2018 version, you have to tick PL Powerup after FPGA programming in the run configuration. It needs to be ticked when you run the application for the first time using the run configurations menu. I believe PL Powerup is similar to PS7_post_config, which enables the level shifters from PL to PS.