MS SQL Server - how to ignore/discard very rare duplicate rows during a table insert


I discovered an error in a client API. It sends pages of unique records based on timestamps. Every so often, it sends the exact same record at the end of the previous page and again at the start of the next page.

I was operating under the assumption that there were no duplicates (this is an event log, after all). Therefore I was doing pure inserts, instead of going through the trouble of an update or some sort of 'on duplicate' step.

My question: what's the best way to get rid of these duplicates and have MS SQL quietly discard the duplicate records upon insert? The duplicate is inserted after the original has already been written in its transaction. These duplicates happen only about once every 10,000 rows, so something memory-light would be great.

The only distinct part of the duplicated records is the "ETL batch id" - everything else is identical. Apart from that, each record is supposed to have a unique "event id". If I make the event id a unique index/key, can I somehow force MS SQL to dump the duplicate keys? Or will the application throw an error message and stop during the query?

There is the infamous IGNORE_DUP_KEY index option that answers your question directly. I try to avoid using it, though, because it changes insert semantics in a subtle and hard-to-discover way.

It is efficient, though: http://blogs.msdn.com/b/craigfr/archive/2008/01/30/maintaining-unique-indexes-with-ignore-dup-key.aspx and http://sqlblog.com/blogs/paul_white/archive/2013/02/01/a-creative-use-of-ignore-dup-key.aspx.
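For illustration, here is a minimal sketch of what such an index could look like. The table and column names (dbo.EventLog, EventId, etc.) are assumptions for the example, not taken from the original question:

```sql
-- Hypothetical event log table; names are illustrative only.
CREATE TABLE dbo.EventLog
(
    EventId    bigint        NOT NULL,
    EtlBatchId int           NOT NULL,
    EventTime  datetime2     NOT NULL,
    Payload    nvarchar(max) NULL
);

-- With IGNORE_DUP_KEY = ON, an INSERT that would violate the unique
-- index silently discards the offending rows and raises only the
-- warning "Duplicate key was ignored." instead of failing.
CREATE UNIQUE INDEX UX_EventLog_EventId
    ON dbo.EventLog (EventId)
    WITH (IGNORE_DUP_KEY = ON);
```

Note that the discard applies to any duplicate-key insert against this index, not just the page-boundary case described above - which is exactly the subtle semantic change mentioned.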

If I make the event id a unique index/key, can I somehow force MS SQL to dump the duplicate keys? Or will the application throw an error message and stop during the query?

Yes, the insert will fail. That is the preferred way to handle this in most situations. If you can't make that work, IGNORE_DUP_KEY might be the next best thing.
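If you want to keep a plain unique index (so that unexpected duplicates still fail loudly) but filter out the known page-boundary duplicates during the load, a common alternative is an INSERT ... WHERE NOT EXISTS against the target table. A sketch, again with assumed table and column names, loading from a hypothetical staging table:

```sql
-- Insert only the staged rows whose EventId is not already present.
-- dbo.EventLogStaging is an assumed staging table holding the
-- incoming page of records from the client API.
INSERT INTO dbo.EventLog (EventId, EtlBatchId, EventTime, Payload)
SELECT s.EventId, s.EtlBatchId, s.EventTime, s.Payload
FROM dbo.EventLogStaging AS s
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.EventLog AS e
    WHERE e.EventId = s.EventId
);
```

This keeps the duplicate handling explicit in the load step rather than hiding it in the index definition, and for duplicates this rare the extra existence check is cheap against a unique index on EventId.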
