apache kafka - Confluent Platform: Schema Registry Subjects -


I am working with Confluent Platform, the platform offered by the creators of Apache Kafka, and I have a question:

In the documentation of the Schema Registry API reference, they mention the abstraction of a "subject". You register a schema under a "subject", which takes the form topicName-key or topicName-value, yet there is no explanation of why you need (as this implies) a separate schema for the key and the value of messages on a given topic. Nor is there any direct statement to the effect that registering under a "subject" associates the schema with that topic, other than mnemonically.

Further confusing matters, subsequent examples on that page ("get a schema version for a subject" and "register a new schema under a subject") do not use that format for the subject name, and instead use just a topic name as the "subject" value. If anyone has any insight into a) why there are these two "subjects" per topic, and b) what the proper usage is, it would be appreciated.

The Confluent Schema Registry is indeed a bit inconsistent with subject names :)

Indeed, KafkaAvroSerializer (used by the new Kafka 0.8.2 producer) uses the topic-key|value pattern for subjects (link), whereas KafkaAvroEncoder (for the old producer) uses the schema.getName()-value pattern (link).
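As a rough sketch of the difference (the helper class and method names below are purely illustrative, not the actual serializer internals), the two naming strategies boil down to something like this:

// Purely illustrative - not the actual Confluent code.
class SubjectNaming {

    // KafkaAvroSerializer (new producer): the subject is derived from the topic
    // name plus whether the record is the message key or the message value.
    static String serializerSubject(String topic, boolean isKey) {
        return topic + (isKey ? "-key" : "-value");
    }

    // KafkaAvroEncoder (old producer): the subject is derived from the Avro
    // schema's name instead of the topic.
    static String encoderSubject(org.apache.avro.Schema schema) {
        return schema.getName() + "-value";
    }
}

So with the serializer the subject is tied to the topic, while with the encoder it is tied to the schema, which is exactly where the inconsistency you noticed comes from.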

The reason why one would have two different subjects per topic (one for the key, one for the value) is pretty simple:

Say you have an Avro schema representing a log entry, and each log entry has some source information attached to it:

{    "type":"record",    "name":"logentry",    "fields":[       {          "name":"line",          "type":"string"       },       {          "name":"source",          "type":{             "type":"record",             "name":"sourceinfo",             "fields":[                {                   "name":"host",                   "type":"string"                },                {                   "name":"...",                   "type":"string"                }             ]          }       }    ] } 

A common use case is that you want to partition entries by source, so you have two subjects associated with the topic (and subjects are effectively revisions of Avro schemas) - one for the key (which is SourceInfo) and one for the value (LogEntry).
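To make that concrete, here is a minimal sketch of registering those two schemas by hand with the Java SchemaRegistryClient. The topic name "logs", the registry URL and the simplified field lists are all made up for illustration, and this uses the older Avro-typed register method of the client:

import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;
import org.apache.avro.Schema;

public class RegisterLogSchemas {
    public static void main(String[] args) throws Exception {
        SchemaRegistryClient client =
                new CachedSchemaRegistryClient("http://localhost:8081", 100); // assumed URL

        // One parser instance so LogEntry can reference SourceInfo by name.
        Schema.Parser parser = new Schema.Parser();
        Schema sourceInfo = parser.parse(
            "{\"type\":\"record\",\"name\":\"SourceInfo\",\"fields\":["
          + "{\"name\":\"host\",\"type\":\"string\"}]}");
        Schema logEntry = parser.parse(
            "{\"type\":\"record\",\"name\":\"LogEntry\",\"fields\":["
          + "{\"name\":\"line\",\"type\":\"string\"},"
          + "{\"name\":\"source\",\"type\":\"SourceInfo\"}]}");

        // Two subjects for one topic: the key schema goes under "logs-key" and
        // the value schema under "logs-value"; each subject keeps its own revisions.
        int keyId = client.register("logs-key", sourceInfo);
        int valueId = client.register("logs-value", logEntry);
        System.out.println("logs-key -> id " + keyId + ", logs-value -> id " + valueId);
    }
}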

Having these two subjects covers partitioning and storing the data, as long as you have the Schema Registry running and your producers/consumers can talk to it. Modifications to these schemas are reflected in the Schema Registry and, as long as they satisfy the compatibility settings, everything should serialize/deserialize without you having to care about this.
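For completeness, here is a minimal producer sketch showing how this looks in practice. It assumes the Confluent serializers are on the classpath, a broker on localhost:9092, the registry on localhost:8081 and the example topic name "logs" (all of these are assumptions, not anything from the docs):

import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LogProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");          // assumed broker
        props.put("schema.registry.url", "http://localhost:8081"); // assumed registry
        props.put("key.serializer",
                  "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("value.serializer",
                  "io.confluent.kafka.serializers.KafkaAvroSerializer");

        // Simplified versions of the schemas above, parsed together so LogEntry
        // can reference the SourceInfo type.
        Schema.Parser parser = new Schema.Parser();
        Schema sourceInfo = parser.parse(
            "{\"type\":\"record\",\"name\":\"SourceInfo\",\"fields\":["
          + "{\"name\":\"host\",\"type\":\"string\"}]}");
        Schema logEntry = parser.parse(
            "{\"type\":\"record\",\"name\":\"LogEntry\",\"fields\":["
          + "{\"name\":\"line\",\"type\":\"string\"},"
          + "{\"name\":\"source\",\"type\":\"SourceInfo\"}]}");

        GenericRecord key = new GenericData.Record(sourceInfo);
        key.put("host", "web-01");

        GenericRecord value = new GenericData.Record(logEntry);
        value.put("line", "GET /index.html 200");
        value.put("source", key);

        try (KafkaProducer<GenericRecord, GenericRecord> producer =
                 new KafkaProducer<>(props)) {
            // Producing to "logs" makes KafkaAvroSerializer register (or look up)
            // the key schema under subject "logs-key" and the value schema under
            // "logs-value"; partitioning by key groups entries by their source.
            producer.send(new ProducerRecord<>("logs", key, value));
        }
    }
}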

Note: the following is just my personal thoughts, and maybe I don't yet fully understand how this is supposed to work, so I might be wrong.

I like how KafkaAvroEncoder is implemented more than KafkaAvroSerializer. KafkaAvroEncoder does not in any way force you to use one schema per topic key/value, whereas KafkaAvroSerializer does. This might be an issue when you plan to produce data with multiple Avro schemas to one topic. In that case KafkaAvroSerializer would try to update the topic-key and topic-value subjects, and 99% of the time that will break if compatibility is violated (and if you have multiple Avro schemas they are almost always different and incompatible with each other).

On the other side, KafkaAvroEncoder only cares about schema names, so you may safely produce data with multiple Avro schemas to one topic and it should work fine (you will just have as many subjects as schemas).
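Here is a rough sketch of why this matters, using the registry client directly. The registry URL, the topic name "logs" and the Metric schema are made up, and whether the second registration is actually rejected depends on the subject's compatibility setting (the default backward mode would normally reject it):

import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException;
import org.apache.avro.Schema;

public class MixedSchemasSketch {
    public static void main(String[] args) throws Exception {
        SchemaRegistryClient client =
                new CachedSchemaRegistryClient("http://localhost:8081", 100); // assumed URL

        Schema logEntry = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"LogEntry\",\"fields\":["
          + "{\"name\":\"line\",\"type\":\"string\"}]}");
        Schema metric = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Metric\",\"fields\":["
          + "{\"name\":\"value\",\"type\":\"double\"}]}");

        // Topic-based naming (KafkaAvroSerializer): both record types end up under
        // the single subject "logs-value", so the second registration is checked
        // for compatibility against the first and is normally rejected.
        client.register("logs-value", logEntry);
        try {
            client.register("logs-value", metric);
        } catch (RestClientException e) {
            System.out.println("rejected as incompatible: " + e.getMessage());
        }

        // Schema-name-based naming (KafkaAvroEncoder): each record type gets its
        // own subject, so both can live and evolve independently on one topic.
        client.register(logEntry.getName() + "-value", logEntry);
        client.register(metric.getName() + "-value", metric);
    }
}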

This inconsistency is still unclear to me, and I hope the Confluent guys can explain it if they see this question/answer.

Hope this helps you.

