amazon web services - Expected behavior for AWS Kinesis ShardIteratorType TRIM_HORIZON

September 15, 2012

Context: I'm not referring to a KCL-based application, just pure Kinesis API calls.

Does using the TRIM_HORIZON shard iterator type give me the earliest published record in the stream (i.e. the earliest record available within Kinesis' built-in 24-hour window), or does it give me an iterator/cursor positioned at the time period 24 hours ago, which I must then advance along the stream until I hit the earliest published record?

Put another way, in case that's not quite clear:

When using a shard iterator type of TRIM_HORIZON, is the expected behavior to begin returning records available as of 24 hours ago? If 0 records were published 24 hours ago, and the earliest was instead published 3 hours ago, does my application need to iteratively poll through the previous 21 hours before it reaches the records published 3 hours ago?

Timeline example:

1. Sept 29 5:00 - Create stream "foo" with 1 shard.
2. Sept 29 5:02 - Publish a single record, "item=a", to the "foo" stream.
3. Sept 29 5:03 - Issue a GetShardIterator call with the TRIM_HORIZON shard iterator type, then issue a GetRecords call with that shard iterator, and receive record "item=a".
4. Sept 30 7:02 - Publish a second record, "item=b", to the "foo" stream.
5. Sept 30 7:03 - Issue a GetShardIterator call with the TRIM_HORIZON shard iterator type, then issue a GetRecords call with that shard iterator. What should the expected result of this call be? (Note: I did not remember/re-use the shard iterator from step 3.)

For step 5 above, it's been more than 24 hours since the "item=a" message was published on the stream, and a minute since "item=b" was published. Will a fresh shard iterator with TRIM_HORIZON give me the earliest available record, or do I need to keep iterating until I hit the time period when something has been published?

I'd been experimenting with Kinesis, and it was working fine yesterday or 2 days ago (i.e. publishing and consuming without issues). I made some additional modifications to my code and began publishing again today. When I fired up my consumer, nothing was coming out at all, even after letting it run for a few minutes. I tried publishing and consuming at the same time, and still nothing. After manually playing with the AFTER_SEQUENCE_NUMBER iterator type, and using sequence numbers from my consumer logs from a few days ago, I was able to reach my published messages. But if I go back to using the TRIM_HORIZON type, I see no messages at all.
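For readers unfamiliar with that workaround, here is a minimal Python sketch of requesting an AFTER_SEQUENCE_NUMBER iterator. It is not the code used above. It assumes `client` is a boto3 Kinesis client (or anything exposing the same `get_shard_iterator` method), and the helper name `iterator_after_sequence_number` is made up for illustration.

```python
def iterator_after_sequence_number(client, stream_name, shard_id, sequence_number):
    """Return a shard iterator positioned just after a known sequence number.

    `client` is assumed to behave like boto3.client("kinesis").
    `sequence_number` is a string, e.g. one copied from an earlier consumer log.
    """
    response = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="AFTER_SEQUENCE_NUMBER",
        StartingSequenceNumber=sequence_number,
    )
    return response["ShardIterator"]
```

The returned iterator is then passed to GetRecords in the usual way. This only works while the referenced sequence number is still inside the retention window.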

I've looked at the docs, but most of the docs I found assume you are using the KCL (I was using the KCL initially, but when it started failing I dropped down to raw API calls) and mention that you must have an application name and that DynamoDB tables are used for tracking state. As best I can tell, that is not true if you're using pure Kinesis API calls or the Kinesis CLI, both of which I tried. I wrote a pure API script to start at TRIM_HORIZON and poll infinitely, and it eventually hit my new records (it took ~600 iterations; it started out 14 hours behind "now" and found records at 5 hours behind "now"). If this is the expected behavior, it seems the wording in the docs is a little confusing/misleading:

TRIM_HORIZON - Start reading at the last untrimmed record in the shard in the system, which is the oldest data record in the shard.

I assumed (now seemingly incorrectly) that the term "oldest data record" meant a record I've published to the stream, not a time period in the stream.

It'd be great if someone can confirm/explain the behavior I'm seeing.

thanks!

It starts at the trim horizon, i.e. the horizon where stream trimming happens.

The shard iterator may return 0 records when called, so you'll need to keep iterating to reach the area with the oldest record (if you push infrequently to the stream or have time gaps). GetRecords will give you the next shard iterator, which you can use to iterate.

From the doc: http://docs.aws.amazon.com/kinesis/latest/apireference/api_getrecords.html

If there are no records available in the portion of the shard that the iterator points to, GetRecords returns an empty list. Note that it might take multiple calls to get to a portion of the shard that contains records.
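The iteration described above can be sketched in Python roughly as follows. This is an illustration under assumptions, not a definitive consumer. `client` is assumed to be a boto3 Kinesis client (or any object with the same `get_shard_iterator` and `get_records` methods). The function name, `max_calls`, and `pause_s` are made up for the example.

```python
import time


def read_from_trim_horizon(client, stream_name, shard_id, max_calls=1000, pause_s=0.2):
    """Poll GetRecords from TRIM_HORIZON until a non-empty batch arrives.

    Returns (records, calls_made, next_iterator). `records` is empty if
    nothing was found within `max_calls` calls or the shard ended.
    """
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]

    calls = 0
    while iterator is not None and calls < max_calls:
        response = client.get_records(ShardIterator=iterator, Limit=100)
        calls += 1
        # Always advance with NextShardIterator, even when the batch is empty.
        # It is missing/None once a closed shard has been fully read.
        iterator = response.get("NextShardIterator")
        if response["Records"]:
            return response["Records"], calls, iterator
        # Small pause between empty polls to stay under per-shard read limits.
        time.sleep(pause_s)
    return [], calls, iterator
```

With boto3 installed and credentials configured, something like `read_from_trim_horizon(boto3.client("kinesis"), "foo", "shardId-000000000000")` would be the intended call; the shard ID here is a placeholder. An empty batch does not mean the stream is empty, only that this portion of the shard holds no records, which is why the loop keeps following the next iterator.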
