Only today did I notice that InfoQ Brasil has published the video of the talk I gave at TDC 2012, titled Sistemas para o Mundo Real (Systems for the Real World).
There you go, enjoy!
Well, the easy life is over! Last week I decided to join my friends at Sequenza IT Solutions; today is officially my first day of work.
Sequenza is a relatively new consultancy, founded by my longtime friends Fernando Lauria and Alex Terra, that has been focusing its efforts mainly on software development, process mentoring, and staffing, along with a few products in the biometrics and merchandising areas.
As I mentioned, we are a relatively new consultancy (I am consultant #11), but one that already has a very interesting portfolio, values that match my own, and excellent prospects ahead.
I'll be doing what I have always done and love to do:
Besides that, there are a number of ideas and projects to be put into practice, mainly regarding new products and the company's relationship with open source software development communities. I am immensely excited about what's to come!
Tomorrow we start a new, very, very interesting project for a German bank. We are going to write some financing simulators, basically using:
Interested? We're hiring!
…friends, wish me luck!
The idea here is to use Twissandra to clarify the column-oriented data model employed by Cassandra, which goes beyond the simple key/value model and is often misunderstood. It's not hard to understand if you have some familiarity with JSON-like data structures, but it can be quite hard if you don't. On the other hand, almost everybody knows the relational data model, so my idea is to use the relational data model as an analogy.
Wait! I don't mean to suggest that you can use Cassandra the same way you use a relational database. Instead, I want to introduce you to this paradigm and give you some ground for further learning. Hopefully, you'll gain some insight into potential use cases as well.
Here we go!
Twissandra is a sample application whose aim is to demonstrate how to use Cassandra. It's essentially a simplified Twitter clone, as you can see at http://twissandra.com.
I kindly ask you to take a look at the project README, specifically the Schema Layout section, to get to know its data model (i.e. users, tweets, friends, etc.), because that will give you the background needed to follow the analogy below.
I assume that at this point you have read Twissandra's schema layout. So now is a good time to put a conceptual model in place, to synthesize what you have read there.
Keyspace = {
    ColumnFamily1 = {
        RowKey1 = {
            ColumnName1 = ColumnValue,
            ColumnNameN = ColumnValue
        },
        RowKeyN = {
            ColumnName1 = ColumnValue,
            ColumnNameN = ColumnValue
        }
    },
    ColumnFamilyN = {
        RowKey1 = {
            ColumnName1 = ColumnValue,
            ColumnNameN = ColumnValue
        },
        RowKeyN = {
            ColumnName1 = ColumnValue,
            ColumnNameN = ColumnValue
        }
    }
}
What does it look like? Did you say a map of maps? Oh yeah, you're right. Let's walk through each piece now.
Analogy: Database, schema
There: Twissandra
This is the top-level identifier of our schema. As such, we usually have one per application.
Analogy: Table
There:
Like tables, column families are containers for rows. Each row has a key and contains a number of columns.
Analogy: Field
There:
Columns are name/value pairs. Their names can be strings, numbers, etc., and they serve as indexes, since columns are stored sorted by name. To put it simply, let's take the Friends column family as an example, in which each row is keyed by a username.
Friends = {
    'hermes': {
        # friend id: timestamp of when the friendship was added
        'larry': '1267413962580791',
        'curly': '1267413990076949',
        'moe'  : '1267414008133277',
    },
}
Each row has friend usernames as column names (i.e. column names can themselves store data) and timestamps as their values, so we can easily tell who a given user's friends are and since when.
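To make the "map of maps" idea concrete, here is a minimal Python sketch that models the Friends column family as nested dictionaries. It is only an in-memory analogy, not Cassandra client code: the helper names get_columns and slice_columns are hypothetical, and sorting by plain string comparison assumes a text-like column comparator.

```python
# In-memory analogy only: row key -> {column name -> column value}.
friends = {
    "hermes": {
        "larry": "1267413962580791",
        "curly": "1267413990076949",
        "moe": "1267414008133277",
    },
}


def get_columns(column_family, row_key):
    """Return a row's columns as (name, value) pairs sorted by column name."""
    row = column_family.get(row_key, {})
    return sorted(row.items())


def slice_columns(column_family, row_key, start, finish):
    """Return the columns whose names fall in the inclusive range [start, finish]."""
    return [
        (name, value)
        for name, value in get_columns(column_family, row_key)
        if start <= name <= finish
    ]


# The column names alone already answer "who are hermes' friends?".
hermes_friends = [name for name, _ in get_columns(friends, "hermes")]
print(hermes_friends)

# Because names are kept in order, a range of names is a contiguous slice.
print(slice_columns(friends, "hermes", "l", "n"))
```

Reading one row gives back everything we need about that user's friends in a single lookup, with no join involved.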
This data model breaks away from the common relational concept of selecting records through joins among many normalised tables. Here we often design our “tables” with future reads in mind rather than writes. In other words, what is read together is stored together.
There is also another type of column, not used in Twissandra, called a Super Column. It's a special type of column that contains a number of regular columns, which gives us something like a map of maps of maps.
Keyspace = {
    ColumnFamily1 = {
        RowKey1 = {
            SuperColumnName1 = {
                ColumnName1 = ColumnValue,
                ColumnNameN = ColumnValue
            },
            SuperColumnNameN = {
                ColumnName1 = ColumnValue,
                ColumnNameN = ColumnValue
            }
        },
        RowKeyN = {
            SuperColumnName1 = {
                ColumnName1 = ColumnValue,
                ColumnNameN = ColumnValue
            },
            SuperColumnNameN = {
                ColumnName1 = ColumnValue,
                ColumnNameN = ColumnValue
            }
        }
    },
    ColumnFamilyN = {
        RowKey1 = {
            SuperColumnName1 = {
                ColumnName1 = ColumnValue,
                ColumnNameN = ColumnValue
            },
            SuperColumnNameN = {
                ColumnName1 = ColumnValue,
                ColumnNameN = ColumnValue
            }
        },
        RowKeyN = {
            SuperColumnName1 = {
                ColumnName1 = ColumnValue,
                ColumnNameN = ColumnValue
            },
            SuperColumnNameN = {
                ColumnName1 = ColumnValue,
                ColumnNameN = ColumnValue
            }
        }
    }
}
This might be useful if we decided, for example, that we should have friend details right in the Friends column family, e.g. their long names. In this case, our Friends column family would actually be a super column family, as follows.
Friends = {
    'hermes': {
        # friend id: timestamp of when the friendship was added and his/her name
        'larry': {
            'longname': 'Larry Page',
            'since': '1267413962580791'
        },
        'curly': {
            'longname': 'Curly Howard',
            'since': '1267413990076949'
        },
        'moe': {
            'longname': 'Moe',
            'since': '1267414008133277'
        }
    }
}
This model is in line with our aforementioned philosophy of “what is read together is stored together”. It's also worth mentioning that super columns are stored sorted by name, i.e. they're indexed, just like regular columns.
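The same nested-dictionary analogy extends one level deeper for super columns. The sketch below is again a plain Python illustration rather than Cassandra client code: friend_details is a hypothetical helper, and the ordering assumes super column names compare as plain strings.

```python
# In-memory analogy only: row key -> {super column name -> {column name -> value}}.
friends = {
    "hermes": {
        "larry": {"longname": "Larry Page", "since": "1267413962580791"},
        "curly": {"longname": "Curly Howard", "since": "1267413990076949"},
        "moe": {"longname": "Moe", "since": "1267414008133277"},
    },
}


def friend_details(super_column_family, row_key):
    """Yield (super column name, sub-columns) pairs sorted by super column name."""
    row = super_column_family.get(row_key, {})
    for name in sorted(row):
        yield name, row[name]


# One row read returns each friend together with his or her details.
long_names = [details["longname"] for _, details in friend_details(friends, "hermes")]
print(long_names)
```

A single read of the 'hermes' row now yields both the friend list and each friend's details, which is the point of storing together what is read together.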
This blog post was admittedly a simplified explanation of the Cassandra data model, relying on an analogy to help people who are already familiar with the relational data model get started with it.
Now that you have a basic understanding, I strongly suggest you read the official explanation on Cassandra's wiki and other good explanations, like the following:
I hope this has helped you!
It was really cool to play around with klogd, but I have to confess that I'd like to have more fun. That is my aim with klogd2.
Klogd2 is essentially a new implementation of klogd, but in Java, relying on Syslog4j. As I said in klogd2's README:
I want to try Syslog4j on the server side, because I know it's rock-solid stuff and all the cool kids are using it, e.g. Graylog2.
Take a couple of minutes to have a look when you can. As usual, I'd really appreciate your feedback and possibly a pull request.
Today I was searching for a way to route Syslog messages to Kafka, since Syslog is the standard destination for logs on Unix-like operating systems and there are many legacy applications that use it and cannot be changed to use something else. Unfortunately, I didn't find anything, so I decided to write something to try it out.
Kafka is a pretty interesting high-throughput distributed messaging system from LinkedIn's data team, whose aim is to serve as the foundation for LinkedIn's activity stream and operational data processing pipeline. They have used it to handle lots of real-time data every day and have open sourced it as an Apache project. I suggest you take a look at its design and use cases.
The result of my first try is klogd.
It's a dead simple Python program that just listens for UDP packets on port 1514 and sends them to a Kafka server. That's it. I know, of course, that there are many things still to be done, because klogd is still too naive. This is just the beginning.
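For readers who want a feel for the shape of such a forwarder, here is a rough sketch of the listen-and-forward loop. It is not klogd's actual code: the function names are hypothetical, and the send argument is a stand-in callable for whatever Kafka producer client you use, so the sketch itself makes no Kafka calls.

```python
import socket


def open_listener(host="0.0.0.0", port=1514):
    """Bind a UDP socket on the given address (1514 is the port mentioned above)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def forward_syslog(sock, send, max_packets=None, bufsize=8192):
    """Read datagrams from sock and pass each raw payload to send().

    send is any callable taking the payload bytes; in a real setup it would
    wrap a Kafka producer (assumption, not shown here). max_packets bounds
    the loop, which is handy for trying it out; None means run forever.
    """
    handled = 0
    while max_packets is None or handled < max_packets:
        payload, _sender = sock.recvfrom(bufsize)
        send(payload)
        handled += 1
    return handled


# Example wiring (commented out so that loading this file does not block):
# listener = open_listener()
# forward_syslog(listener, send=print)
```

Keeping the Kafka side behind a plain callable makes the loop easy to try locally, for example by passing print or a list's append method.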
Take some time to try it and give me your feedback. Better yet, fork it, hack it, and send me a pull request.
Well, after almost 3 years of working intensely at Locaweb, last Monday I made the decision to leave and go find my own way out there. I still don't know what I'm going to do, or where I'm going. For now, I'll just spend some time resting, to flush my ideas and recharge my energy, and then I'll get back to work, insanely, as always.
Summing up my time as part of the Locaweb team:
So, what can I say? It was worth it! And I will miss it…
To my friends at Locaweb, my eternal respect and gratitude.
See you around. 😉
Last week I was at TDC 2012, presenting my talk Sistemas para o Mundo Real in the architecture track. It was really nice: the room was full, and the crowd was very interested and motivated. It was well worth it.
This was the first time I attended this event and I really liked it! I found it very well organized and diverse; congratulations to the organizers.
Next year I'll be there again, for sure. 😉
I have just presented my talk Sistemas para o Mundo Real at Abril pro Ruby 2012, the first edition of a very promising event from the Ruby community of Recife/PE, organized by Victor Cavalcanti and his colleagues from Frevo on Rails, and sponsored by Locaweb, redu, eventick and ThoughtWorks.
And although it is an event focused on Ruby, made by the Ruby community for Ruby programmers, my talk was not focused on Ruby. It was broader, more focused on concerns around the architecture and operation of systems in production. It was nothing very deep, more of an overview, meant to raise the topic and encourage people to research, study, and take these concerns into account.
The good news is that there are a lot of people here interested in the subject.
Hi everybody!
I've started to write a kind of documentation for Cameron, in the spirit of a presentation, really brief and direct. It's still a work in progress, like Cameron itself, but I think it can already give you an idea of what Cameron aims to be.
Stay tuned…
A brief introduction to Cameron