Insert One Million Rows into SQL Server - How to Use SqlBulkCopy

  • Published on Dec 16, 2024

Comments • 22

  • @paang8385
    @paang8385 3 years ago +1

    If we also have to check for duplicates, will the performance be the same?

    • @seanmackenziedataengineering
      @seanmackenziedataengineering 3 years ago

      If you need to check for duplicates, it is probably faster to load into a staging table, remove the duplicates there, and then add the rows to your destination table. Good question!
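
      A minimal sketch of that staging approach, with placeholder table and column names (Customers, Customers_Staging, Id, Name):

      using System.Data;
      using Microsoft.Data.SqlClient; // or System.Data.SqlClient

      static void LoadWithDedupe(DataTable rows, string connStr)
      {
          using var conn = new SqlConnection(connStr);
          conn.Open();

          // Bulk copy everything into the staging table first.
          using (var bulk = new SqlBulkCopy(conn))
          {
              bulk.DestinationTableName = "dbo.Customers_Staging";
              bulk.WriteToServer(rows);
          }

          // Let the server filter out duplicates in one set-based insert.
          var sql = @"INSERT INTO dbo.Customers (Id, Name)
                      SELECT s.Id, s.Name FROM dbo.Customers_Staging s
                      WHERE NOT EXISTS (SELECT 1 FROM dbo.Customers c WHERE c.Id = s.Id);
                      TRUNCATE TABLE dbo.Customers_Staging;";
          new SqlCommand(sql, conn).ExecuteNonQuery();
      }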

  • @sharayuwagh2085
    @sharayuwagh2085 3 years ago +1

    I need to insert a list of DTO objects into the database using SqlBulkCopy... can you suggest how to do it?

    • @seanmackenziedataengineering
      @seanmackenziedataengineering 3 years ago

      Actually, I'm not sure on that one. You would probably convert them to DataTables and then use SqlBulkCopy, though this might be less efficient during transfer.
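
      A rough sketch of that conversion, assuming a hypothetical Person DTO and a matching dbo.People table:

      using System.Collections.Generic;
      using System.Data;
      using Microsoft.Data.SqlClient;

      static void BulkInsertDtos(IEnumerable<Person> people, string connStr)
      {
          // Build a DataTable whose columns mirror the destination table.
          var table = new DataTable();
          table.Columns.Add("Id", typeof(int));
          table.Columns.Add("Name", typeof(string));
          foreach (var p in people)
              table.Rows.Add(p.Id, p.Name);

          using var bulk = new SqlBulkCopy(connStr);
          bulk.DestinationTableName = "dbo.People";
          bulk.WriteToServer(table);
      }

      public record Person(int Id, string Name); // the hypothetical DTO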

  • @mehdisoleimanian5805
    @mehdisoleimanian5805 4 years ago +1

    Thanks. I want to compare one table with 5 tables and copy data into the destination table if the ID does not exist. How can I do that?

    • @seanmackenziedataengineering
      @seanmackenziedataengineering 4 years ago

      It sounds like you'll compare with some other tables and only copy if some conditions are met. You can check conditions and make filtered data using DataViews, then use SqlBulkCopy to insert the result sets. See my earlier video on filtering and sorting DataTables using DataViews! Thanks for your question!
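
      A short sketch of the DataView idea; the filter expression and table names are just examples:

      using System.Data;
      using Microsoft.Data.SqlClient;

      static void CopyFiltered(DataTable source, string connStr)
      {
          // Keep only the rows that pass your comparison, e.g. IDs not found yet.
          var view = new DataView(source) { RowFilter = "Id > 1000" };
          DataTable filtered = view.ToTable();

          using var bulk = new SqlBulkCopy(connStr);
          bulk.DestinationTableName = "dbo.DestinationTable";
          bulk.WriteToServer(filtered);
      }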

  • @user-rp9iis1en6h
    @user-rp9iis1en6h 4 years ago +1

    I am already using bulk copy for the insertion. But let's say there are 100M rows in a table and you want to delete 1M rows. How do you do that in a faster, more optimal way? (We have the row IDs.) Deleting by ID, one by one, still takes a significant amount of time. Is there anything that does the delete operation like bulk copy, i.e. a bulk delete?

    • @seanmackenziedataengineering
      @seanmackenziedataengineering 4 years ago +1

      This is a great question. If you need to delete a large number of rows like this, and you have a list of row ID numbers like you said, then you can just use your SqlConnection to execute a single statement to delete the 1 million rows.
      If you do not have your row ID list (rows to delete) in a table in the same database yet, then you should push the list into a table (temporarily) on the server using SqlBulkCopy. It should not take much space since it is just a list of key ID numbers. For the same reason, it should copy quickly. Let the server do the work.
      After the key list is in a database table, you can use something like:
      Delete From MyBigTable Where ID In (Select ID From TempIDTable);
      Finally, drop the TempIDTable if you want to conserve space.
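
      A minimal sketch of that whole flow, assuming the IDs are already in memory; TempIDTable and MyBigTable are placeholder names:

      using System.Collections.Generic;
      using System.Data;
      using Microsoft.Data.SqlClient;

      static void BulkDelete(IEnumerable<long> ids, string connStr)
      {
          using var conn = new SqlConnection(connStr);
          conn.Open();

          // A single-column key list is small, so it copies quickly.
          var keys = new DataTable();
          keys.Columns.Add("ID", typeof(long));
          foreach (var id in ids) keys.Rows.Add(id);

          new SqlCommand("CREATE TABLE dbo.TempIDTable (ID BIGINT PRIMARY KEY);", conn)
              .ExecuteNonQuery();
          using (var bulk = new SqlBulkCopy(conn))
          {
              bulk.DestinationTableName = "dbo.TempIDTable";
              bulk.WriteToServer(keys);
          }

          // One set-based delete on the server, then clean up.
          var sql = @"DELETE FROM MyBigTable WHERE ID IN (SELECT ID FROM dbo.TempIDTable);
                      DROP TABLE dbo.TempIDTable;";
          new SqlCommand(sql, conn) { CommandTimeout = 0 }.ExecuteNonQuery();
      }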

    • @user-rp9iis1en6h
      @user-rp9iis1en6h 4 years ago

      @seanmackenziedataengineering Yes, I can follow this approach, but I was looking for something even faster.

  • @mittalpa
    @mittalpa 4 years ago +1

    Very helpful article, Sean.
    Is there a way to ensure all the rows were inserted successfully, other than counting after insertion?

    • @seanmackenziedataengineering
      @seanmackenziedataengineering 4 years ago

      Thanks for your question! There is an option you can try in your SqlBulkCopy statement that will require it to use a transaction. This is called SqlBulkCopyOptions.UseInternalTransaction. This will make your insert into an "all or nothing" situation, which might better suit your needs. If any rows cannot be inserted due to errors, then no rows will be inserted. Good luck on your project!
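
      A quick sketch of that option (the table name is a placeholder); with the default BatchSize of 0, the whole copy is a single batch, so a failure rolls everything back:

      using System.Data;
      using Microsoft.Data.SqlClient;

      static void AllOrNothingInsert(DataTable rows, string connStr)
      {
          // UseInternalTransaction wraps each batch in its own transaction.
          using var bulk = new SqlBulkCopy(connStr,
              SqlBulkCopyOptions.UseInternalTransaction);
          bulk.DestinationTableName = "dbo.MyTable";
          bulk.WriteToServer(rows);
      }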

  • @parasvillis5849
    @parasvillis5849 3 years ago +1

    Can we use this to update a column in a table? If yes, will it work with the same efficiency?

    • @seanmackenziedataengineering
      @seanmackenziedataengineering 3 years ago +1

      You can, with a few steps. Typically I would use SqlBulkCopy to push just the key and the update column(s) to a staging table; then, once the staging table is loaded, execute an update statement to update rows of your target table from the staging table. Drop the staging table when the operation is complete. It will be much faster than individual update statements!
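
      A minimal sketch of that staged-update pattern; Target, Target_Staging, and the column names are hypothetical:

      using System.Data;
      using Microsoft.Data.SqlClient;

      static void BulkUpdate(DataTable keysAndNewValues, string connStr)
      {
          using var conn = new SqlConnection(connStr);
          conn.Open();

          // 1. Push just the key and the updated column(s) to the staging table.
          using (var bulk = new SqlBulkCopy(conn))
          {
              bulk.DestinationTableName = "dbo.Target_Staging";
              bulk.WriteToServer(keysAndNewValues);
          }

          // 2. One set-based update from staging, then drop the staging table.
          var sql = @"UPDATE t SET t.Amount = s.Amount
                      FROM dbo.Target t
                      JOIN dbo.Target_Staging s ON s.Id = t.Id;
                      DROP TABLE dbo.Target_Staging;";
          new SqlCommand(sql, conn).ExecuteNonQuery();
      }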

  • @manadochannel6025
    @manadochannel6025 3 years ago +1

    I have a question for you: why does SQL Server memory keep growing when we use SqlBulkCopy? Is there any solution to handle it?

    • @seanmackenziedataengineering
      @seanmackenziedataengineering 3 years ago

      It depends on where you run your console app. If you run it on the same computer as the SQL Server, it will use memory. It is better to run it on a different computer than the server. If you are already using a different computer and it is still a problem, try using the BatchSize property, which will break up the job into smaller amounts.
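
      A short sketch of the BatchSize property (the table name is a placeholder):

      using System.Data;
      using Microsoft.Data.SqlClient;

      static void BatchedInsert(DataTable rows, string connStr)
      {
          using var bulk = new SqlBulkCopy(connStr);
          bulk.DestinationTableName = "dbo.MyTable";
          bulk.BatchSize = 50000;   // send 50k rows per batch
          bulk.NotifyAfter = 50000; // optional progress reporting
          bulk.SqlRowsCopied += (s, e) =>
              System.Console.WriteLine($"{e.RowsCopied} rows copied");
          bulk.WriteToServer(rows);
      }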

  • @behappey8494
    @behappey8494 4 years ago

    Please, what if I have tables that are related, and one takes a foreign key from another table? How can I deal with this? Thank you, sir.

    • @seanmackenziedataengineering
      @seanmackenziedataengineering 4 years ago

      You mean that the destination table has a foreign key constraint? If this is the case, you must make sure the parent table is populated first, with key entries that you can then use in your main insert statement.
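
      A hedged sketch of that load order, with hypothetical Parent and Child tables (Child has a foreign key to Parent):

      using System.Data;
      using Microsoft.Data.SqlClient;

      static void LoadParentThenChild(DataTable parents, DataTable children, string connStr)
      {
          using var conn = new SqlConnection(connStr);
          conn.Open();

          // Parent rows must exist before child rows can reference them.
          using (var bulk = new SqlBulkCopy(conn))
          {
              bulk.DestinationTableName = "dbo.Parent";
              bulk.WriteToServer(parents);
          }
          using (var bulk = new SqlBulkCopy(conn))
          {
              bulk.DestinationTableName = "dbo.Child";
              bulk.WriteToServer(children);
          }
      }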

  • @1990Ananth
    @1990Ananth 4 years ago +1

    Thanks for the video, Sean. A couple of quick questions. My end goal is to copy a table from one database to another database. These databases are located on the same server.
    1) How do I differentiate between source and destination tables?
    2) Could you explain how the for loop works when I want to copy the same contents?

    • @seanmackenziedataengineering
      @seanmackenziedataengineering 4 years ago

      Thanks for your question! For databases on the same server, you will not need to use SqlBulkCopy, since you can specify a single statement to the server that will copy the data much more efficiently. You will use something like:
      Insert Into DATABASETWO.dbo.MYTABLECOPY (COLUMN1, COLUMN2..) SELECT COLUMN1, COLUMN2 FROM DATABASEONE.dbo.MYTABLE
      You can also specify the server, e.g. SERVER1.DATABASETWO.dbo.MYTABLECOPY
      Try using this method, as it does not require the data to be pulled into client memory the way SqlBulkCopy does; SqlBulkCopy is used more for network migrations (not on the same server).

    • @1990Ananth
      @1990Ananth 4 years ago

      @seanmackenziedataengineering Thanks for the reply. However, we are looking to transfer close to 1B records, and a single INSERT INTO would be a costly operation.
      Do you recommend that we generate a data out file and load it?

    • @seanmackenziedataengineering
      @seanmackenziedataengineering 4 years ago

      @1990Ananth I see. In this case I would still use an INSERT as mentioned, but write a chunking procedure to only insert a range of records each time. It is still a benefit that both databases are on the same server. There is a nice technique here: stackoverflow.com/questions/864341/move-sql-server-data-in-limited-1000-row-chunks
      In your case, you would use the naming I mentioned above. Also, if you are not deleting, you can calculate the row numbers or ID sets to move each time.
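
      A sketch of that chunking idea, using the cross-database naming from my earlier reply; the assumption that MYTABLE has a numeric ID column to range over is just for illustration:

      using Microsoft.Data.SqlClient;

      static void CopyInChunks(string connStr, long maxId, long chunk = 1000000)
      {
          using var conn = new SqlConnection(connStr);
          conn.Open();

          // Move one fixed ID range per statement so each insert stays manageable.
          var sql = @"INSERT INTO DATABASETWO.dbo.MYTABLECOPY (COLUMN1, COLUMN2)
                      SELECT COLUMN1, COLUMN2 FROM DATABASEONE.dbo.MYTABLE
                      WHERE ID >= @lo AND ID < @hi;";
          for (long lo = 0; lo <= maxId; lo += chunk)
          {
              using var cmd = new SqlCommand(sql, conn) { CommandTimeout = 0 };
              cmd.Parameters.AddWithValue("@lo", lo);
              cmd.Parameters.AddWithValue("@hi", lo + chunk);
              cmd.ExecuteNonQuery();
          }
      }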

    • @seanmackenziedataengineering
      @seanmackenziedataengineering 1 year ago

      @Old Guy Adventures You can use the logic you see here. Use the two queries you have that split the table into two views. Use this logic to select from those queries and point the SqlBulkCopy to your new database and it will make two tables there. Good luck!