nunojob:~ dscape/08$ echo The Black Sheep

Posts tagged ‘IBM’

IBM DB2 Express-C for Mac

DB2 for Mac


It’s official. The FREE version of DB2 is available for download for Mac.

No more excuses along the lines of “I don’t want yet another virtual machine just to run that, so I won’t even try it.”

I know I’m biased, since I’m part of the DB2 team. The analysis here is heavily influenced by my day-to-day work, but what I write here is my personal opinion.

IBM doesn’t work on DB2 for people like us, who have some nice little sites with a few thousand hits a day (with luck). They build it to support giant-scale solutions, some with heavyweight XML standards, for government agencies, financial institutions, health care and so on, which transact enormous amounts of information every day. Those companies not only have to mine the data but also query it very intensively. I’m talking about the largest American companies, and I’m not reciting this from memory: I heard it straight from DBAs at Merrill Lynch, Barclays, the UN, Morgan Stanley, etc. What do they have in common? They all use DB2 and are interested in using the product’s XML features.

By the way, nobody believes it’s possible to get performance out of XML, right? Well, IBM has smart people (like me, lol) working on making that possible. I’ll leave this link to whet your appetite. Of course the performance won’t be the same as SQL, but compared with the XML parsers you’ve been using… eheh. Try it. :P

Having described the typical DB2 customer, it’s easy to see that it isn’t made to be sold to José or Joaquim, or even to Josefina’s small company. That is exactly why the Express-C version is free for everyone. The limits are a maximum of 16 GB of RAM and 4 processors on the machine.

If that sounds reasonable:

DB2 for Mac Download


Afterwards, tell me how it went, and if you need some tips you can always get in touch.

Footnote: for those interested, if you are developing something with an obscure XML standard, there is a good chance IBM supports it, and you can check here.

Apache CouchDB

It’s official. yey! Great news :) 

Damien’s post.

NYU visit to Hawthorne

I had the pleasure of being one of the hosts for the NYU group visit to Hawthorne. The students were very lucky and got to see some really good talks. Robert Morris explained why services are so important for IBM and how research applies to business. It was an excellent, spontaneous talk, and it was too bad he didn’t stick around so I could talk to him about it. Then they saw a talk on stream processing and how to manage real-time processing and analysis of quantities of data that most people would consider impossible to handle. I also got to see Grady Booch for the first time, in Second Life. He gave a great talk about the beauty of software (www.booch.com/architecture/blog/artifacts/Turing%20Lecture.ppt) and showed the potential of Second Life for this kind of event. By then both Robert and Grady had piqued my curiosity about the future of intellectual property, but I guess that’s not a good subject for showing IBM to graduate students :) eheh. It finished beautifully with Laxmi talking about the famous Genographic Project (https://www3.nationalgeographic.com/genographic). For those who still don’t know, the Genographic Project is a joint venture between IBM and National Geographic that tries to find out more about the history of migrations using DNA. Cool!! The social aspect was also really important and I think the students were happy with the feedback we gave them about IBM. By the way, the lack of blog posts is mostly because Twitter is taking over every blog post of mine that fits in under one hundred and something chars.

Some Pictures from New York

Thank you Susan for taking them while I still haven’t bought a decent camera :) I will probably have some more when Mailar decides to put them online. Maybe once he returns from SVL.

By the way doesn’t that duck look menacing? :\ More pictures? Here.

No reason to disappear

Well, ever since I started working I’ve had less time for my blog. I really dislike that, as this little corner gives me a lot of joy, not only in the moment of writing but also in sharing information and enabling others to learn new and exciting technologies. That’s one of the things I want to talk about today: the power of sharing.

In the last two weeks I was contacted by people (mostly experienced people) who were using either my k-anonymity algorithm or my open source online testing system. The fact that people are using my silly scripts is rewarding enough, but the networking opportunities and being an enabler for others are what really thrill me the most. It’s pleasant to see people overcome their difficulties, to help them learn a new technology and then to see the final result. I hope to continue receiving your kind emails about my work. I will always have 5 minutes to help.

I think that if you are a blogger but you don’t share what you’re doing, you’re missing out. Your code will vanish onto some CD or will just remain in some folder, buried forever. Let your code live. Let it enable others. Let it grow. In the meantime it will help you get wherever you want. Actually, I got some more exciting news just from sharing. Unfortunately I can’t disclose it just yet. It involves being invited to help write a free Ruby on Rails book. Sorry, had to say it :) How cool is that?

Actually, the most boring thing about working at a large corporation is the fact that you cannot share what you do. It no longer belongs to you, and even if it’s OK to share it you have to follow a complex process to get authorization to do so. So no one does! Everybody is waiting to sue, so you simply can’t do things that you assume would be OK. Mentioning a trademarked brand without stating that the brand belongs to X can cost the company a significant amount (and cost you your job :P), and they simply can’t risk it. But oh well. The least you can say is that the good things cancel that out.

I have been working on DB2 and I’m really surprised by the quality of the pureXML support and the overall behavior of the solution. The lack of advertising, and of tools that IBM distributes with DB2, doesn’t inspire the same trust as other databases that are better advertised and have better tools. But when you get down to business you will realize that DB2 and Oracle are in a league of their own. Just talking to the clients will make you realise that they buy because other databases can’t handle the amount of information. And if you do XML, for the next 6 months at least I’m sure that DB2 will be miles ahead. How much longer will it take for companies to realize that storing XML in CLOBs probably ain’t the best solution? Answer: not much…

Oh, final note! I finally picked the theme for my master’s thesis. It will be about data mining in XML databases, so if you feel like you have a nice title suggestion for it just leave it in the comment box. It will be greatly appreciated.

Home Alone

Well, having a house all to myself means some extra work. I’ve thought about hiring a cleaner, but poor as I am, I’m the one who has to cook, do the laundry and the dishes, and clean the house. Along the way I keep inventing new dishes. At the IBM Café I had a very good grilled salmon with salad (by the way, the Chicken Quesadillas are also something) and today I decided to try cooking that fish myself. Nothing difficult, just sautéed onions and a bit of diced tomato. Served with one of those rice dishes that anyone (except Miguel Regedor, who puts milk in his rice) can cook. It’s a pity the Mac takes such lousy photos that you can’t really see anything.

Anyone who wants the recipe will probably find it on Twitter. The usual. Oh, I bought a Shuffle. My damn phone’s battery is shot and I couldn’t listen to music. I could write a review, but everybody already knows it’s one of the best products Apple has ever made.

Guide to using Windows

Not working? Reboot, wait 25 minutes, use it while it’s still loading some programs. Start working with 75% of the RAM in use and several processes competing for CPU time. Lovely..

Still not working after the reboot? Format. Now it works! :) 75% of the memory in use, and Java will probably crash the computer within the next two hours. Blue screen guaranteed once a week.

This is what people call enterprise ready. Yeah right. They pile things on top of things without caring about quality.

Use Linux! Three reasons (could be 30 thousand, but I’ll stop here)

  • Cheaper
  • More reliable
  • Faster
  • Fewer crashes
  • No reboots
  • No formats
  • Has anyone ever used MS-DOS? I wouldn’t wish it on anyone. It’s torture.
  • Got a problem? Read the code and submit the fix.
  • Finally, to complete this list of three: IT’S BETTER!

Thanks, that’s all.

Open Source Online Testing System (OSOTS)

I simply can’t believe that I forgot to post this project on my blog. Anyway, if you have to deploy a multiple choice test and you feel like messing around with Ruby on Rails and DB2, here goes the link.

Open Source Online Testing System (OSOTS)

Stock market crisis hits the biggest IT companies

It’s news to no one. I know.

But not everyone has an exact idea of the fear this crisis is creating among Americans. At Yahoo, for example, there have already been layoffs. The fears range from simply losing the kids’ college money because of the stocks, through fear of shortages of essential goods like gas, food and fuel, to fear of an imminent attack due to the apparent fragility of the US in the face of this crisis. What is certain, even though I’m no analyst, is that the danger of a possible domino reaction is enormous. Here is the chart of the stocks of the most representative IT companies (in my opinion, of course) on the day IBM shares dropped below 100 USD.

Well, I’m going to try IBM Lotus Notes for Mac OS. If it goes well I’ll try to get w3 running on the Mac. By the way, have you tested the Greenhouse software yet? If you manage to, let me know; I’ve tried in 3 browsers (Firefox, IE and Chrome) and couldn’t.

IBM visits universities with the DB2 on Campus program

The DB2 on Campus program, which aims to bring the student community closer to this IBM database engine, is coming to Portugal.

I was heavily involved in this program and was recognized by IBM for my efforts as one of the first Portuguese DB2 Ambassadors. Two examples: the development of OSOTS and a demonstration of the METS Standard with IBM DB2 Express C using XForms as user-interface and Ruby on Rails as a Rest Web-services.

These were the projects that raised my visibility among some IBMers, which made it possible for me to be hired to work at the IBM Watson Research Center in New York. That is, in fact, the only reason I won’t be attending: I won’t be in Portugal.

Anyone interested in databases and XML, or perhaps just in meeting other IBMers, can (and should) drop by DB2 on Campus. The event will be held at two universities: Universidade do Minho and Instituto Superior Técnico.

Instituto Superior Técnico (TagusPark)

Universidade do Minho

IBM Data Studio Developer 1.2 Official Release

The new version of IBM Data Studio Developer was released today.

Since I joined the DB2 Students Group as an ambassador I’ve had the privilege of learning and playing with some IBM solutions. Among the things I liked (let’s not talk about sad things, like the IBM_DB adapter for Ruby, which at the time was quite outdated) was the ability to store XML natively in the database. pureXML with XQuery support was something I used last semester in projects like the Open Source Online Testing System. The fact that LeFevre implemented Mondrian on DB2 is also a plus for any data-security nut. And everyone had the chance to see that I implemented Mondrian last semester for the cryptography and computer systems security course.

Data Studio Developer is a tool for developers who use DB2. The new version’s highlights include support for pureQuery and the fact that it uses the same kind of persistence technology as Project Zero. Still interested? Then you can grab it here.

Mondrian Multidimensional K-Anonymity in Ruby

Article: Mondrian Multidimensional K-Anonymity

Lame Ruby Implementation:

# ==================================================================================
# anonymization: group.rb
# ==================================================================================
ENVIRONMENT = 'release' # set to 'debug' to load ruby-debug and enable tracing

require 'set'
require 'rubygems'
require 'ruby-debug' if ENVIRONMENT == 'debug'

# ==================================================================================
# class group
#
# usage:
#  require 'group'
#
#  g = Group.new <quasi_ids>, <filename>
#  g.anonymize <k>
#
# example:
#
# lefevre.db
# 
#     0             2   <-- quasi_ids
#
#   |age|  sex  | zipc | disease      |
#---+---+-------+------+--------------+--
# 0 | 25  Male    53711 Flu           |
# 1 | 25  Female  53712 Hepatitis     |
# 2 | 26  Male    53711 Bronchitis    |
# 3 | 27  Male    53710 Broken_Arm    |
# 4 | 27  Female  53712 AIDS          |
# 5 | 28  Male    53711 Hang_Nail     |
#---+---+-------+------+--------------+--
#
# irb
#  >> require 'group'
#  >> g = Group.new [0,2], 'lefevre.db'
#  >> g.anonymize 2, 'degen'
# ==================================================================================
class Group
  # create a setter method for @tuples, @filename
  # so that g.tuples = x works
  attr_writer :tuples, :filename
  
  @@debug = { 'best_attribute' => ENVIRONMENT == 'debug',
              'intersection'   => ENVIRONMENT == 'debug',
              'split'          => ENVIRONMENT == 'debug',
              'ordering'       => ENVIRONMENT == 'debug',
              'vars'           => ENVIRONMENT == 'debug',
              'args'           => ENVIRONMENT == 'debug'
           }
  # ================================================================================
  # to create a new group with Group.new
  # ================================================================================
  # needs to remove the full_ids from the read.
  def initialize(quasi_ids, filename, depth=0, available_ids=nil)
    # if no valid attributes are given quasi are used
    available_ids = quasi_ids if available_ids.nil?
    
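    # initialize the instance vars
    @tuples = []
    @quasi_ids = quasi_ids
    # work on a copy so the caller's array is not modified by anonymize
    @available_ids = available_ids.clone
    # remembered so that reset can restore the original list
    @originally_available_ids = available_ids.clone
    @depth = depth

    # serves as wildcard so that no file is read on recursion
    @filename = (filename == '*wc') ? nil : filename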
    
    if @@debug['args'] and @depth == 0
      debug_puts "args : file => #{@filename}"
      debug_puts "args : k => #{@k}"
      debug_puts "args : quasi_ids => #{@quasi_ids.to_s}"
    end
              
    
    # run the read and backup procedures
    read
  end
  
  # ================================================================================
  # anonymization
  # ================================================================================
  def anonymize(k, heuristic='degen', partial_order=[])
    
    if @@debug['vars']
      #debug_puts "dvars : @tuples #{@tuples}" 
      debug_puts "dvars : @available_ids #{@available_ids},"
      debug_puts "dvars : @depth #{@depth}"
    end

    # stop case
    if isnt_splittable? k
      debug_puts "dsplit: no split available for k-level #{k} with size" +
                 " #{@tuples.size}" if @@debug['split']

      # sort and generalize remaining attributes
      @available_ids.each do |attribute|
        sort attribute
        generalize attribute
      end

      # exit
      return
    end

    # where and in what attribute should we split
    # these functions have a heavy effect on the usefulness of the information
    # for the k-anonymity table
    split_attribute  = find_split_attribute @available_ids, heuristic, partial_order
    split_pos        = find_split_position split_attribute

    # create the groups for the 
    # recursion
    group1 = Group.new @quasi_ids, '*wc', @depth + 1, @available_ids.clone
    group2 = Group.new @quasi_ids, '*wc', @depth + 1, @available_ids.clone

    # split at the given position
    split split_pos, group1, group2

    if split_groups_satisfy_k_anonymity?(k,group1,group2)
  
      debug_puts "dsplit: no more split available with attribute" + 
          " #{split_attribute} (g1: #{group1.size}, g2: #{group2.size})" if @@debug['split']

      # generalize by split_attribute and then remove it from the available
      # attributes array
      generalize split_attribute
      @available_ids.delete split_attribute

      # anonymize remaining available attributes
      anonymize k, heuristic, partial_order

    else # splitting successful
      debug_puts "dsplit: splitting on attribute #{split_attribute} at" +
                 " position #{split_pos} of #{@tuples.size}" if @@debug['split']
      
      # assign the two groups to this instance
      @group1 = group1
      @group2 = group2
      
      group1.anonymize k, heuristic, partial_order
      group2.anonymize k, heuristic, partial_order
      
      #@tuples = []
    end
  end

  # ================================================================================
  # io and backup related 
  # ================================================================================
  # read @tuples from @filename
  def read
    unless @filename.nil?
      f = File.open @filename
      f.each_line do |line|
        @tuples << line.rstrip.split("\t\t")
      end
      f.close
    end
  end
  

  # reset the class to reuse
  def reset
    @available_ids  = @originally_available_ids
    @tuples = []
    read
  end

  # ================================================================================
  # overrides
  # ================================================================================
  # number of tuples
  def size
    @tuples.size
  end

  # ================================================================================
  # aux
  # ================================================================================
  # to_s
  
  def to_s
    
    str = ""
    
    unless @tuples.empty?
      @tuples.each do |line| 
        @tuples[0].size.times { |i| str << line[i].to_s + "\t\t"}
        str << "\n"
      end
    end

    str
  end

  # prints a yaml representation of the internal object
  # (Kernel#y is only available inside irb on current rubies)
  def to_y
    require 'yaml'
    puts to_yaml
  end

  private
  
  def debug_puts(message)
    ident=''
    @depth.times {|i| ident+="  "}
    puts ident + message
  end

  # ================================================================================
  # aux for anonymization
  # ================================================================================
  # finds the attribute to split on. The default heuristic picks the attribute
  # with the largest range of values, which according to LeFevre is a good choice
  def find_split_attribute(attributes_list, heuristic, partial_order)

    debug_puts "dorder: choosing from" + 
               " #{attributes_list.to_s}" if @@debug['ordering']

    best_attrib = -1
    best_attrib_count = 0.0

    attributes_list = find_minimal_elements partial_order, attributes_list

    debug_puts "dorder: minimal list is" +
               " #{attributes_list.to_s}" if @@debug['ordering']

    attributes_list.each do |attribute|
      values = @tuples.map{|t| t[attribute]}.to_set
  
      # degen heuristic: split on the attribute with the most degeneracy,
      # i.e. the highest number of tuples per distinct value
      if heuristic == 'degen'
        degeneracy = @tuples.size.to_f / values.size.to_f
        if degeneracy > best_attrib_count or best_attrib == -1
          best_attrib = attribute
          best_attrib_count = degeneracy
        end
      elsif heuristic == 'single'
        if values.size < best_attrib_count or best_attrib == -1
          best_attrib = attribute
          best_attrib_count = values.size
        end
      else #default
        if values.size > best_attrib_count
          best_attrib = attribute
          best_attrib_count = values.size
        end
      end
    end

    debug_puts "dbest : best attribute is #{best_attrib} with" + 
               " count #{best_attrib_count}" if @@debug['best_attribute']
    
    return best_attrib
  end
  
  #  returns the position of the leftmost or rightmost median element.
  #  used to split in lhs and rhs 
  def find_split_position(attribute_id)
    sort attribute_id
    
    median_pos = @tuples.size / 2
    median = @tuples[median_pos][attribute_id]
    
    split_pos_high = median_pos
    split_pos_low  = median_pos
    
    # split point correspond to highest index that has median value
    split_pos_high += 1 while (@tuples.size >= split_pos_high + 2) and
                              (@tuples[split_pos_high + 1][attribute_id] == median)
      
    high_smaller_group_size = 
            [split_pos_high + 1, @tuples.size - split_pos_high - 1].min

    # split point correspond to lowest index that has median value
    split_pos_low -= 1 while (split_pos_low > 1) and
                              (@tuples[split_pos_low - 1][attribute_id] == median)
    
    low_smaller_group_size = 
            [split_pos_low, @tuples.size - split_pos_low].min
    
    # choose the one with the largest group
    if high_smaller_group_size > low_smaller_group_size
      split_pos = split_pos_high
    else
      split_pos = split_pos_low - 1
    end
    
    return split_pos
  end
  
  # finds minimal elements from the list of the given attribute list according to
  # partial order specified in partial_order. partial_order contains all complete chains.
  def find_minimal_elements(partial_order, possible_elements)
    
    if partial_order.empty?
      debug_puts "dorder: no ordering specified" if @@debug['ordering']
      
      return possible_elements
    end

    # choose all possible_elements that arent in partial_order
    # those are minimal
    minimal_list = possible_elements.select { |element| !partial_order.flatten.member?(element) }
    
    # haskell goodies ^^
    # restrict partial_order to values in possible_elements
    restricted_partial_order = partial_order.map { |l| l.select { |element| possible_elements.member?(element) } }
    
    if @@debug['ordering']
      debug_puts "dorder: possible_elements list is" + 
                 " #{possible_elements.to_s}"
      debug_puts "dorder: partial_order list is" +
                 " #{partial_order.to_s}" 
      debug_puts "dorder: restricted_partial_order is" + 
                 " #{restricted_partial_order.to_s}"
    end

   non_zero_chains = restricted_partial_order.select { |chain| not chain.empty? }

   non_zero_chains.each do |c|
     candidate = c[0]
     
     minimal = !restricted_partial_order.any? do |chain|
        chain.member?(candidate) and chain[0] != candidate
     end
     
     if minimal and not minimal_list.member?(candidate)
       minimal_list << candidate
     end
   end

   return minimal_list
  end

  # replaces attribute value with generalization that cover all tuples.
  # Expects tuples to be sorted by attribute.
  def generalize(attribute)
    min_val = @tuples[0][attribute]
    max_val = @tuples[-1][attribute]
    
    unless min_val == max_val
      @tuples.each do |t|
        t[attribute] = [min_val, max_val]
      end
    end
    
  end

  def split(split_pos, group1, group2)
    group1.tuples = @tuples[0..split_pos]
    group2.tuples = @tuples[split_pos+1..@tuples.size]
  end

  def sort(attribute)
    @tuples = @tuples.sort_by { |t| t[attribute] }
  end
  
  # ================================================================================
  # verbose conditions
  # ================================================================================
  def isnt_splittable?(k)
    k < 2 or group_cant_be_split_for_level?(k) or no_split_attributes_are_available?
  end
  
  def group_cant_be_split_for_level?(k)
    @tuples.size < 2*k
  end
  
  def no_split_attributes_are_available?
    @available_ids.empty?
  end
  
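  # NOTE: despite its name, this returns true when the split VIOLATES
  # k-anonymity, i.e. when either resulting group has fewer than k tuples
  def split_groups_satisfy_k_anonymity?(k,group1,group2)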
    group1.size < k or group2.size < k
  end
end

# hack on array to display lists correctly
class Array
  def to_s
    "[" + self.join(',') + "]"
  end
end
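
To make the goal of the class above concrete, here is a minimal sketch of what "k-anonymous" means for the lefevre.db sample. The helper k_anonymous? and the hand-generalized table are illustrative assumptions, not part of group.rb and not actual output of Group#anonymize.

```ruby
# Hypothetical helper (not part of group.rb): a table is k-anonymous when every
# combination of quasi-identifier values is shared by at least k rows.
def k_anonymous?(rows, quasi_ids, k)
  # rows with identical quasi-identifier values form one equivalence class
  classes = rows.group_by { |row| row.values_at(*quasi_ids) }
  classes.values.all? { |group| group.size >= k }
end

# The sample table from the header comment: age, sex, zip code, disease.
RAW = [
  [25, 'Male',   53711, 'Flu'],
  [25, 'Female', 53712, 'Hepatitis'],
  [26, 'Male',   53711, 'Bronchitis'],
  [27, 'Male',   53710, 'Broken_Arm'],
  [27, 'Female', 53712, 'AIDS'],
  [28, 'Male',   53711, 'Hang_Nail']
].freeze

# Generalized by hand for illustration: age and zip code are replaced by
# [min, max] ranges, the same shape that Group#generalize writes.
GENERALIZED = [
  [[25, 26], 'Male',   [53711, 53712], 'Flu'],
  [[25, 26], 'Female', [53711, 53712], 'Hepatitis'],
  [[25, 26], 'Male',   [53711, 53712], 'Bronchitis'],
  [[27, 28], 'Male',   [53710, 53712], 'Broken_Arm'],
  [[27, 28], 'Female', [53710, 53712], 'AIDS'],
  [[27, 28], 'Male',   [53710, 53712], 'Hang_Nail']
].freeze

puts k_anonymous?(RAW, [0, 2], 2)
puts k_anonymous?(GENERALIZED, [0, 2], 2)
```

In the raw table every (age, zip code) pair is unique, so it fails for k = 2; the generalized table has two classes of three rows each, so it passes for k = 2 and k = 3 but not for k = 4.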

We begin again, we NEVER give up!

The line is from Lars Gustafsson, in my favorite book, “The Death of a Beekeeper”. Which, with a bit of luck, João Moura is reading.

DB2 Rocks

qs = Question.find_by_sql(
  "select X.* from ots_schema.questions, " +
  "XMLTABLE ('$d/question' passing document as \"d\" " +
    "COLUMNS question_text VARCHAR(200) " +
    "PATH 'question_text') as X")
qs.first.question_text.lstrip
=> "Which of the following is the correct syntax to set the DB2COMM variable to TCPIP?\n  "

If DB2 had a good Ruby driver and an ActivePureXML adapter (or something like it) it would so f*ckin rock. Just look at the sample. And the dynamic nature of Ruby would enable the flexibility of XML documents.

Please IBM, please: DB2 for Mac and decent support for Ruby. Don’t make me write things like this any more:

# Once again fixing IBM_DB bugs the ugly way
# with_scope anyone?
add_index :'ots_schema.users', :login

or

t.column :document, :xml

Valid and Well-formed XML Documents in Ruby on Rails

If you’re working with DB2 on Rails you probably need to check that the XML document is, at least, well formed.

I found a neat plugin called validates-xml that uses REXML to see if documents are well formed.

To install it simply

svn co http://validates-xml.googlecode.com/svn/trunk/ validates-xml-read-only

And copy that folder to vendor/plugins.

You can easily integrate schema validation with a validates_xml_with_schema method. But you should use REXML, as it has come with the standard Ruby bundle since version 1.8.

To use it simply restart your server and

validates_xml :xmldocument

That’s it!
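
For readers who want to see the underlying idea without the plugin, here is a minimal sketch of a well-formedness check using REXML directly. The helper name well_formed_xml? is an assumption made for illustration; it is not the plugin's code or API.

```ruby
require 'rexml/document'

# Hypothetical helper: true when the string parses as XML and has a root element.
def well_formed_xml?(text)
  doc = REXML::Document.new(text)
  # an empty string parses without raising, but yields no root element
  !doc.root.nil?
rescue REXML::ParseException
  # raised by REXML for malformed input such as mismatched tags
  false
end

puts well_formed_xml?('<question><question_text>DB2COMM?</question_text></question>')
puts well_formed_xml?('<question><question_text></question>')
```

A Rails model could call a check like this from a custom validation before saving the document into an XML column; well-formedness says nothing about whether the document is valid against a schema.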

Yet Another Ruby on Rails Web Application – Teaser

Rails::ActiveRecord XQuery && DB2::Mac

Most Rails developers use OS X, mostly because Rails is built into the latest release (Leopard) and TextMate offers a great IDE to use with Rails. Some could argue that it also works nicely on Windows, but I really believe that Rails isn’t meant to be used on a Windows platform. Many articles and interviews with Rails creator DHH back this up. Rails is easy to develop on OS X and deploy on a Linux distro of your choice.

In my most recent screen-cast I explored an interesting technique that takes advantage of Rails’ RESTful design and DB2 pureXML features to easily create a web service that can query a relational database with XML support (like DB2). As you probably know, DB2 Express-C is distributed freely and places no limitations on home users or small companies. So it would be awesome to extend ActiveRecord to support XML elements when a schema is provided to the database.

I’m aware that a pure XML database strategy would be a bad approach, but there are situations where it simply makes sense. And in those situations one would profit greatly from two things which are missing.

  • XML Support for ActiveRecord – in construction?
  • DB2 for Mac – will this ever exist?

Antonio Cangiano is creating a plugin that revolves around these concepts, but from a DB2-centered perspective. In my opinion it would be best if the Rails plugin worked not only for DB2 but for any database adapter with XML support, as long as one is provided.

Let me give you a sample. Imagine that you have a database model for translating your Rails application. It could be something like:

Languages

  • ID, int
  • DOCUMENT, XML

Imagine that the xml file is

<language isoname="pt-PT">
  <hello>Olá</hello>
  <bye>Até a próxima!</bye>
</language>
(...)

If this makes sense in your application then you could easily do something like register your model to observe (see observer design pattern) the session['language'] for changes and, if it’s changed, it would get all the XML for that language – it would fall back to default if such was not available – and create the hash with the values. The problem with this approach – besides making no sense for the internationalization problem! – is that in the observer model you would have to:

Language.find :first

And fetch the whole xml and then process it. If the XML document has 10MB, it would take some time. With XQuery support on ActiveRecord we could simply

Language.find :first, 
  :xcondition => ["//language[@isoname=?]", 
    session['language']]

Or, if we simply wanted to say hello in many languages, like Flickr does on its front page, we could simply

Language.find :first, :xcondition => '//hello'

There are just two problems that prevent Rails developers from doing this kind of thing: the lack of XQuery abilities in ActiveRecord and the fact that Mac developers cannot use DB2.
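
The :xcondition option above is a wish, not an existing ActiveRecord feature. To show what those two path expressions would select, here is a minimal sketch that runs equivalent XPath queries in memory with REXML. The <languages> wrapper element, the second language and the helper names are assumptions added for illustration.

```ruby
require 'rexml/document'

# Assumed sample data: the post's <language> element, wrapped in a
# <languages> root so that one document can hold several languages.
LANGUAGES_XML = <<~XML
  <languages>
    <language isoname="pt-PT">
      <hello>Olá</hello>
      <bye>Até a próxima!</bye>
    </language>
    <language isoname="en-US">
      <hello>Hello</hello>
      <bye>See you next time!</bye>
    </language>
  </languages>
XML

# Counterpart of //language[@isoname=?]: the greeting for one language.
# String interpolation is fine for a sketch; never do it with untrusted input.
def hello_for(xml, isoname)
  doc = REXML::Document.new(xml)
  language = REXML::XPath.first(doc, "//language[@isoname='#{isoname}']")
  language && language.elements['hello'].text
end

# Counterpart of //hello: every greeting in the document, in document order.
def all_hellos(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//hello').map(&:text)
end

puts hello_for(LANGUAGES_XML, 'en-US')
puts all_hellos(LANGUAGES_XML).inspect
```

This still parses the whole document in the application, which is exactly the cost the post wants to avoid; with XQuery support the database would evaluate the path and return only the matching fragment.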

METS Standard with IBM DB2 Express C using XForms as user-interface and Ruby on Rails as a Rest Web-services

HIGH QUALITY VERSION DOWNLOAD HERE

Well, here is the long-promised screen-cast. The number of topics covered is simply huge. To get you ready for it I prepared some more introductory screen-casts as well as some articles on these subjects. I’m sorry that I don’t have the time to document REST, but I really advise you to invest some time in learning it, as it’s a very pragmatic way of delivering high quality web services.

I strongly advise you to download this screen-cast from RapidShare, as the quality on both YouTube and Google Video is really awful. You can download it from here.

This work was really fun to do, so I hope to have the opportunity to develop it further and handle nested REST routes like /mets/1/agents, returning the agents of the first submission information package (SIP) using some cool DB2 pureXML features. I really feel that, with a good plugin to help users take full advantage of DB2 pureXML features and a little imagination, this web service could be of some use.

I also expect to complete the xforms model as it is not indexing a fileptr to each category when such is selected. I hope to implement this soon enough.

Here are the associated resources I developed:

And here are the other two screencasts I produced to introduce you to XForms and METS:

I also advise you to take a look at these articles. All of them were very helpful to my work.

xForms Example (CSS)

Sample XForms application using CSS

So here comes my first ever XForms sample. If you are interested in testing it out I’ll have to give you some pointers. First of all, your browser probably can’t render XForms, so you need to find a specific add-on for your browser to make it work. As these XForms were made in Mozilla Firefox with this add-on, I strongly recommend that you use it to get the same results. This is not my fault: XForms is not as mature as HTML or CSS and therefore these things tend to happen. You will see what I’m talking about when I get to the Known Issues! :P

Now two short notes:

  • For the load/save button to work you will need to change the path in the code. Just look for xforms:submission and replace the file path with the one on your computer.
  • You need to comment out what’s between these two markers for the load button to work. Why? The bind is interfering with the instance replace. How do you fix it? If I knew, I would have. I think that when I remove the bind and add the schema this behavior will be fixed.

<!-- remove this lines to erase load bug. Using a schema will fix this -->
<!-- EOB: End of bug -->

Credits for the icons go to famfamfam.

Here is the list of known issues (any help is welcome):

  • Not binded to the schema. That’s because I’m lasy and METS is hard work.
  • xforms:select1 won’t hide the scroll on overflow:hidden. I’m not sure this is my responsibility as the add-on doesn’t represent a full XForms implementation.
  • Submission not implemented. That was really not the point.
  • Submission button is relevant when no authorid or workid is given. Anyone?
  • Can’t style the ‘add’ and ‘remove’ triggers differently from the other xforms|trigger elements. I tried creating a class for them, with a weird inverted triangle as the result. To replicate this bug, find a trigger and add style="border:100px;".

For now: that’s all folks! Have a nice week.

Creating a User Interface for XML Schema using xForms

We use XSLT all the time to convert an XML document into another format. XML Schemas are XML documents too, and I think it would be interesting to generate the user interface and model for a specific schema using xForms. This would drastically decrease development time and would give a quick view of what that XML Schema is all about. In other words, this would do for xForms what JAXB does for Java.
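To make the idea concrete, here is a minimal sketch in Python of what such a transformation could look like. It is only an illustration under loud assumptions: the toy schema, the schema_to_controls helper and the one-input-per-typed-element rule are all made up for this example, and a real engine would also have to handle complex types, bindings and the model.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
XF = "http://www.w3.org/2002/xforms"
ET.register_namespace("xforms", XF)  # serialize controls with an xforms: prefix

# Hypothetical toy schema, invented for this sketch (it is not METS).
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="agent">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="role" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def schema_to_controls(xsd_text):
    """Return one xforms:input element per named, typed xs:element."""
    controls = []
    for element in ET.fromstring(xsd_text).iter(f"{{{XS}}}element"):
        name = element.get("name")
        if name is None or element.get("type") is None:
            continue  # skip containers such as <agent>, which have no type attribute
        control = ET.Element(f"{{{XF}}}input", {"ref": name})
        ET.SubElement(control, f"{{{XF}}}label").text = name
        controls.append(control)
    return controls


for control in schema_to_controls(SAMPLE_XSD):
    print(ET.tostring(control, encoding="unicode"))
```

The same mapping could of course be written as an XSLT stylesheet, since both the input and the output are XML; Python is used here only to keep the sketch short.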

Why xForms? xForms is a W3C Recommendation for creating user interfaces and it’s based on the MVC pattern. This is enough to convince me; I’m all for standards.

While researching I found some work on this subject [here, here], but it is mostly in conference proceedings, and therefore I thought no one had really implemented a generic Schema-to-xForms transformation engine. Before giving up I asked Vítor Rodrigues about this, as IBM is known for its XML work on alphaWorks. Guess what? Now I have the solution.

Here’s what you need to do to create an xForm from a Schema:

  • Download Eclipse WTP
  • Open Eclipse and browse to Help > Software Updates > Find and Install
  • Click New Remote Site
  • Name: Emerging Technologies Toolkit (ETTK) Updates
    URL: http://awwebx04.alphaworks.ibm.com/ettktechnologies/updates
  • Select it and click Finish
  • Now select Emerging Technologies Toolkit (ETTK) Updates > Emerging Technologies Toolkit – Technologies > Visual XForms Designer
  • Click next.
  • Accept the license, install the software and restart Eclipse.

Now for the Visual XForms Designer I made a little screen-cast. Just click here to see it.

xForms Resources

We will post an xForms/METS/DB2 screen-cast on this website soon. To get you prepared for this I advise you to download this Firefox plugin and read the following articles:

In case you are looking for good examples you can visit this wikibook or better yet… Open your console, and create a directory in your home folder. Enter the folder and write this:

svn checkout http://xforms-examples.googlecode.com/svn/trunk/ xforms-examples-read-only

These are some great samples from Google Code. Believe me, I’m saving you time with this advice. You won’t need to spend hours on badly executed examples from companies that are supposed to be building xForms implementations. Ouch!?

From Ubuntu/Windows to Mac OS-X Leopard

I just got my first brand new MacBook and, as one would expect, some problems emerged by leaving my usual OS combo (Ubuntu/Windows) for Mac OS X Leopard. I’m still undecided whether to install Ubuntu or use VMWare Fusion to use Linux. As for Microsoft Windows as long as I don’t need to use .NET I don’t think I’ll install it. Not even with the free licenses Microsoft provides University of Minho students.

First I had to decide what to install:

The question you are probably asking is why Firefox instead of Safari!? Because the damn unsafe add-on for del.icio.us is just that: del.icio.us! I just can’t live without it. Maybe if the guys over there worked on the basics of security we could say that it’s both excellent and safe! And don’t say there’s an add-on for Safari as well. I’ll only accept it if the add-on is as good as the one provided by Yahoo.

Now we go for the fun part: the problems and solving them. Your attention please: this is only fun for me when I do it on my computer, so it won’t be a pleasure if you ask me to fix yours! Joana, this one is for you, I know how your mind thinks.. Almost as badly as mine. Oh well, we are related, aren’t we? :\ Please release those crazy Christmas photos to the world! :X

I’ll start with a small OS-independent tip and then go for the Q/A part. If you are Portuguese like myself you just hate all the results that Google gives you from .br domains. Most of the results are overrated and most of the time you just wish that they weren’t there. If this is the case just do your normal search in Google but write -br at the end. Just test this: go to Google and type governo brasileiro -br. It works ;)

  1. How do I get my mouse to behave like a normal one by removing this dreadful acceleration?

    Easy. Check Rúben’s blog @ http://blog.0x82.com/2007/12/mac-os-and-usb-mouse.html

  2. What’s wrong with the encoding in TextEdit? The automatic discovery does not work, and even when I specifically tell it to use this encoding it won’t work. The “Where is ISO 8859-15” dilemma.

    Still don’t know. Anyone? If you want to test the bug just do this (from here):

    $ echo é > ~/a.txt
    $ cat ~/a.txt
    é
    Open a.txt with TextEdit and get:
    È

    Cracked it! At least I have a partial fix for migrating documents from Ubuntu to Mac OS X Leopard:

    cmd + space terminal enter

    {go to the directory where the file is}

    bash$ iconv -f ISO-8859-15 -t UTF-8 {filename} > UTF8-{filename}

    For me it was:

    bash$ iconv -f ISO-8859-15 -t UTF-8 Pam.tex > UTF8-Pam.tex

    And yes, PAM as in Pluggable Authentication Modules. So I have a LaTeX assignment on that :P

    
    
  3. Where is type-a-name application? It’s not on the dock.

    Just press cmd + space and type what you are looking for. This is called spotlight. Either this or open Finder and go to applications.

  4. Why doesn’t it show my dashboard when I go to the left/whatever corner?

    Have you even configured your Mac? Come on.. Click the apple in the top left corner and go to System Preferences, then Exposé and Spaces. And now that you are there, take a look around and configure all the other things that you neglected.

  5. This computer is always underlining the words in red. (when the spelling is right :P If not you are a dumbass for asking this!)

    Just press cmd + mouse and change the language. Thanks Nuno Veloso from Apple Tech Team for this tip.

  6. How can I see the right mouse button menu?

    ctrl + mouse click

  7. PageUp/PageDown!?

    fn + up
    fn + down
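If you would rather script the encoding fix from question 2 than call iconv by hand, the same conversion can be sketched in Python. This is a minimal sketch, assuming the source file really is ISO-8859-15; the convert_to_utf8 helper and the file names are made up for the example.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(source, target, source_encoding="iso-8859-15"):
    """Re-encode a text file as UTF-8, like: iconv -f ISO-8859-15 -t UTF-8."""
    raw = Path(source).read_bytes()
    # Decode with the old encoding, then write the same text back as UTF-8.
    Path(target).write_bytes(raw.decode(source_encoding).encode("utf-8"))
    return Path(target)


# Small demo in a temporary directory: 0xE9 is "é" in ISO-8859-15.
with tempfile.TemporaryDirectory() as folder:
    old_file = Path(folder) / "a.txt"
    old_file.write_bytes(b"\xe9\n")
    new_file = convert_to_utf8(old_file, Path(folder) / "UTF8-a.txt")
    print(new_file.read_text(encoding="utf-8"))
```

Working on bytes keeps the line endings untouched, so only the character encoding changes.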

If after this you still feel like reading about Mac OS for rookies – like myself – you can check this section in Arun Gupta’s Blog or check this extensive list of Keyboard Shortcuts. Have fun!

Now let’s get back to work, something almost impossible when you’re doing a report in LaTeX and the encodings are never right.

One final touch: ctrl + alt + cmd + 8. Merry Christmas! :)

Featured DB2 Student Ambassador

Featured IBM DB2 Student Ambassador

I’m really happy to announce that I was selected as a featured DB2 Student Ambassador on the IBM website. I would really like to thank Vítor Rodrigues for all the help, for being my DB2 mentor and, most of all, a good friend! I also feel obliged to mention the work of the other DB2 Ambassadors who have helped to establish the first Portuguese DB2 Study Group. So Simão, Hélder, João, Nuno and Marcelo, thank you :) Finally, I would like to thank CeSIUM and CAOS for their support, two student organizations that I am a part of and that provided the hardware and know-how that made the site, blog and DB2 server a reality.

I hope that with this many Portuguese students like myself will see that their work can be recognized – even by industry leaders like IBM – if they just go the extra mile. So do it! It’s your life, live it to the fullest.

To be great, be whole: nothing
Of yours exaggerate or exclude.

Be all in each thing. Put all that you are
Into the least you do.

Thus in each lake the whole moon
Shines, because it lives on high

Ricardo Reis (Fernando Pessoa)

Ruby 1.9

In any case, if only to take my mind off the aberrations that have been happening at my university (once again), I found his post on Ruby 1.9 performance relevant. I recommend reading it. As a summary, here are the results obtained by comparing the Fibonacci function for the first 36 numbers.

Ruby 1.8.6:       158.869s

Python 2.5.1:      31.507s

Ruby 1.9.0:        11.934s
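For anyone who wants to try something similar at home, here is a minimal sketch in Python of this kind of benchmark. It assumes a naive recursive Fibonacci evaluated for each of the first n numbers; it is not the code from the original post, and the function names are made up.

```python
import time


def fib(n):
    """Naive doubly recursive Fibonacci, deliberately left unoptimized."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


def run_benchmark(count):
    """Compute fib(0) .. fib(count - 1) and return (values, elapsed_seconds)."""
    start = time.perf_counter()
    values = [fib(i) for i in range(count)]
    return values, time.perf_counter() - start


# 20 keeps the demo quick; use 36 to match "the first 36 numbers" above.
values, elapsed = run_benchmark(20)
print(f"computed {len(values)} values in {elapsed:.3f}s")
```

Timings depend heavily on the machine and interpreter version, so only comparisons made on the same hardware are meaningful.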

Paul Rand and El Laberinto del Fauno

If there’s one blog I enjoy reading, it’s Bruno Júlio’s. I recently removed the planet where he usually writes from my feeds, but I took care to migrate his feed to my Google Reader.

And it’s because I like his posts so much that I’m going to do him the disservice, once again, of publishing a video he made available on his blog. (Sorry Bruno!) I found there this fantastic tribute video to Paul Rand, creator of the IBM, UPS and abc logos. You can read more about him on his Wikipedia page.

I’ll take the opportunity to say that I won’t be publishing my five favorite films. Not because I’m against the initiative, but simply because I can’t choose five. But I’ll put forward one that probably few of you have seen and all of you should: last year’s Fantasporto winner, El Laberinto del Fauno. Simply divine.

That said, here is the tribute to Paul Rand.

DB2 Express-C 9.5 Released

A new version of DB2 Express-C was made available by IBM today. As you might already know DB2 Express-C is a free dataserver without limits on database size or number of users.

We got this news from

You won’t find this new version on the IBM DB2 Express-C website, but you can download the Linux version here and the Windows version here. A Mac version is still not available. This is probably the downside, as many of us are waiting for IBM to ship a DB2 Express-C version for Macintosh. Hopefully next time?

I haven’t tested the new DB2 yet, but according to Antonio:

Version 9.5 brings to the table a great deal of new enhancements and features. Among these there is also the reduced amount of memory requirements and the improved efficiency of operations on Linux. Oh and it will also make you rich and more beautiful. I mean, pureXML support has been greatly improved and it is a release that is sure to please many people on both Windows and Linux.

I know the t-shirt looks great. Hopefully Vítor will send me one! :P

New CeSIUM Board

With 31 votes in favor, a stapler, a pair of scissors and a telephone, the new board of the Centro de Estudantes de Engenharia de Sistemas e Informática da Universidade do Minho (CeSIUM) was elected today. Against it, just three blank votes (and a pair of sneakers).

Its new president must be having a lucky day, since he was also announced today as the new IBM Thingy. Congratulations!

Personally, I’m a member of CAOS, the Centro de Apoio ao Open Source (Open Source Support Center), which is also part of CeSIUM. This is without a doubt the most dynamic and motivating student-run computing project at the University of Minho. I hope that, in the near future, many students will benefit from the outreach and support we provide in the area of free software. For now: SVN repositories for members and a free software mirror. Other news soon, such as a LaTeX workshop for beginners given by Nuno Veloso and Ulisses Costa. I’ve also heard talk of providing members with hosting (LAMP), with a reasonable amount of space, for free. I for one hope it’s LAMPr :P

You’ve got to have Ruby!

To finish, to all those who know me as the Optimus hacker: welcome.

For tutorials on how to hack Optimus, visit this page.

IBM DB2 Student Group

Well, as I mentioned in an earlier post, I’m a DB2 Student Ambassador at the University of Minho. We have just finished installing and configuring the project’s blog. For now it’s still very green, but we hope good things will come out of it. When I’m less sleepy :P

Fancy a little chocolate milk? Of cooourse..

[EDIT: Is sticking magnets on a microwave a good idea!?..

And I got a 16 on the electricity and magnetism exam.. tsk tsk..]

IBM DB2 Student Ambassador

I was looking at Celso Martinho’s blog, which had a very interesting article on how to cite a blog in, for example, a scientific paper, when I received an email I had been expecting for some time.

The first Portuguese IBM DB2 Student Ambassadors have been announced and, to my delight, I was one of those chosen. The other two representatives are Hélder Pereira and Simão Soares, and the University of Minho is the first to form a group of this kind in Portugal.

It will be great to have the opportunity to work closely with a company like IBM, especially when the colleagues who were chosen are so competent. News soon.. :P

For now, and to finish, my version of what a “tesourinho deprimente” (a depressing little treasure) is.

A day full of news

First, just after midnight, I was officially invited to join the team of the Centro de Apoio ao Open Source (CAOS) at CeSIUM. I accepted the invitation since it is, without a doubt, one of the best learning centers at the University of Minho. At least judging by who has passed through it… But that page! It’s enough to turn a person Catholic, “my god” :P

Then came IBM’s approach about the DB2 Ambassador program.

I recently had a very negative experience with Microsoft in a program of this kind (MSP), so I no longer know what to expect from big companies. Maybe IBM is different, because I don’t work with companies like Microsoft. Companies need us and should therefore treat us with respect and professional dignity. In my opinion, it’s not only us who need Microsoft; Microsoft needs us too. But if it treats all MSPs the way it did when inviting them this year, they can forget it.

I advise people, with total conviction, against even talking to Microsoft about becoming Student Partners. It may be a very good program, but only do it if you find it reasonable to submit, sell yourself and take “shit” from others. If you don’t find that reasonable, forget it.

Now, at the end of the day (and a little beyond), news has come out about the format of the Sapo Codebits contest. Read it and see why I say this is a programming contest worthy of the name. How lucky those invited to this first edition will be; maybe I’ll get lucky too, and if not I’ll follow the event from the outside.
