Universal Translators Are All Around Us
Since machine translation is one of the topics here, this is an interesting article with some video demos:
http://singularityhub.com/2009/11/23/universal-translators-are-all-around-us-video
For a while now I’ve been thinking about using knowledge representation (ontologies) as a base for creating a modular Natural Language Processing system focused on extracting structured data from unstructured text. For example, we can create or use ontologies (models) that describe “simple” concepts like Address, Time, Task, Expense, Transaction, etc., and use them to “match” information from a text stream. The reason I’m writing this is that I think there is common ground for collaboration. I know Stefan is interested in RDF/OWL, the company that Neven is involved with is in a closely related domain, and finally I have been playing with Google Wave, which I think is a good platform for creating intelligent bots that will be very easy to distribute if they turn out to be useful :)
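The matching idea described above can be sketched roughly as follows. This is only a toy illustration in Python: the `Concept` class, the `CONCEPTS` list, and the `extract` function are hypothetical names, and the regular expressions merely stand in for real ontology models, which would more likely be defined in RDF/OWL. Only two of the concepts mentioned, Time and Expense, are covered.

```python
import re
from dataclasses import dataclass


@dataclass
class Concept:
    """Toy stand-in for an ontology class: a name plus a pattern with named slots."""
    name: str
    pattern: re.Pattern


# Hypothetical concept definitions (assumptions for illustration only).
CONCEPTS = [
    # 24-hour clock time such as 9:05 or 23:15
    Concept("Time", re.compile(r"\b(?P<hour>[01]?\d|2[0-3]):(?P<minute>[0-5]\d)\b")),
    # A currency symbol followed by an amount such as $14.50
    Concept("Expense", re.compile(r"(?P<currency>[$€£])(?P<amount>\d+(?:\.\d{2})?)")),
]


def extract(text):
    """Return (concept name, slot dict) for every match, ordered by position in the text."""
    found = []
    for concept in CONCEPTS:
        for match in concept.pattern.finditer(text):
            found.append((match.start(), concept.name, match.groupdict()))
    found.sort(key=lambda item: item[0])
    return [(name, slots) for _, name, slots in found]


if __name__ == "__main__":
    for name, slots in extract("Lunch with Stefan at 12:30, paid $14.50"):
        print(name, slots)
```

Each concept stays independent of the others, so new ones (Address, Task, Transaction) could be added as separate modules without touching the matching loop.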
Here are some references:
http://wordnet.princeton.edu/
http://protege.stanford.edu/
http://jena.sourceforge.net/
http://www.openrdf.org/
http://code.google.com/apis/wave/guide.html
Although I’m currently not even in a nearby domain (kernel-level C driver development is not close by any means), it is very interesting to me and I would gladly participate…
Awesome, I’ll try to start with some examples “soon”
I want to add a few NLP projects that I personally find interesting: OpenNLP, which is written in Java, and NLTK, which is written in Python.
Thanks Nikolay, I browsed around the links you provided and stumbled on Apache UIMA, a graduated IBM Research project which has recently been approved as an OASIS Standard. I think that it could be used as a framework in which we can plug our NLP modules when they mature.
It seems pretty good, but why is it still in the incubator… since 2006?
Here’s one project that uses Apache UIMA – SEASR. It has interesting stuff such as sentiment analysis.
This is a very interesting topic indeed.
Knowledge representation is just one part of the process. It is a relatively simple task to represent knowledge in digital form, no matter how complex the structures or algorithms you use or how much processor power and memory you need. But the task does not end with representing this knowledge; you need to do something with it.
One of the obstacles that existed before was the enormous quantity of information that needs to be gathered first and then processed, but now, with Google and all the other web spiders, collecting the information is feasible.
By the way, Google’s machine translation is mostly based on texts they find on the web that they believe are translations of one another. It’s a version of statistical machine translation. Others use other sources for parallel corpora, such as books, legal documents, etc.
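To make the statistical idea concrete, here is a minimal sketch of IBM Model 1, a classic word-level statistical translation model trained with EM on a parallel corpus. It is not Google’s actual system, whose details are not described here; the function names and the three-sentence toy corpus are assumptions made up for illustration.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t[(target_word, source_word)] from (source_tokens, target_tokens) pairs.

    Simplified IBM Model 1: no NULL source word, no smoothing.
    """
    # Any uniform start works because the E-step normalizes per target word.
    t = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        count = defaultdict(float)  # expected co-translation counts
        total = defaultdict(float)  # expected counts per source word
        # E-step: share each target word among the source words of its sentence.
        for source, target in pairs:
            for f in target:
                norm = sum(t[(f, e)] for e in source)
                for e in source:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta
                    total[e] += delta
        # M-step: renormalize the expected counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {f: p for (f, e), p in t.items() if e == source_word}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    # Toy English-German parallel corpus (made up for this example).
    corpus = [
        ("the house".split(), "das Haus".split()),
        ("the book".split(), "das Buch".split()),
        ("a book".split(), "ein Buch".split()),
    ]
    table = train_ibm_model1(corpus)
    for word in ["the", "house", "book", "a"]:
        print(word, "->", best_translation(table, word))
```

Nothing here knows any grammar or dictionary; the word pairings emerge only from which words co-occur across sentence pairs, which is why the size and quality of the parallel corpus matter so much.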
I will draw a parallel here: just as the hypothetical best compression algorithm is very similar to a random generator in its behavior and in the result it produces, the perfect machine translation engine is very similar to the best knowledge representation and processing system. It should in fact be a representation of the entire human knowledge, with the ability to derive new representations of it in the form of one human language or another.
I like the idea of presenting a program by what it does. The pairing of a computer language with computer behavior is not very different from the pairing of human language with human behavior. One day we will create programs just by giving verbal examples to some kind of knowledge processor that will convert them to machine code based on what we expect the program to do.
I forgot to mention that there was an internationally recognized NLP conference in Bulgaria in September, organized by the Bulgarian Academy of Sciences, that I was invited to attend but couldn’t for reasons beyond my control. The lecturers were mostly from the EU, with a few from the US.
Also, don’t take what I’m saying here about NLP too seriously; I’m not proficient enough in that area.
Hello! I’m not sure what an ontology is; I’m not a linguist. Let’s say an “ontology” is the grammatical way of building a sentence (in English). As I have a simple language processor (a phrase generator), we are able to connect a pattern/logical or even a derived relational database with different “ontologies”/patterns to the phrase generator. And we are not talking about Natural Language Processing but about a Virtual Natural Language Processor. An “ontology” may then be any grammatical rule.
Just as you can connect one application to different databases, you are able to create a purely logical database based on Artificial Intelligence, with predicates connected to the application through recurrent connections (not a database) based on neural network theory. And so this application becomes a bot to communicate with.
The beginning (reference):
http://www.languagetool.org/ (be patient and smart)
http://extensions.services.openoffice.org/node/2297
Hi guys,
As you already know from my post on Facebook, our BlackBerry product was promoted on RIM’s App World as a featured application last Friday. We are now getting thousands of downloads and activations, about 400 per hour. That is good.
We get quite good exposure not only through App World but also from other mobile portals. However, we need to develop our business and move the company to the next stage.
It’s been a couple of months already since we started looking for new funding sources. Our company, Interlecta, has been privately held and self-funded for almost 3 years, but it seems it is time for a change.
Right now we are talking to several potential investors (Corp, AIs & VCs) that are current or potential customers of our products, but not all opportunities look that promising or suitable for us.
So, if you think that you know someone, or have a friend of a friend who may know someone … any ideas are welcome.
And of course, the standard finder’s fee will apply to anyone who refers an investor that turns into a deal.
According to quite a few specialists I’ve been talking to recently, the next several months will be the best time to invest in start-ups, simply because there are not that many left and those that survived are expected to have good value.
Great news! Wish you luck! I’ll talk to our CEO about your company.
Neven,
Congratulations and good luck to your company. Being an alien to the RIM ecosphere, I am very curious about how your application approaches multi-language conversation, especially from the user experience side. Could you please post some screenshot URLs and answer a few questions about your app? Thanks in advance.
Here are some screenshots: http://appworld.blackberry.com/webstore/content/screenshots/2009
Also, a flash demo here: http://media.interlecta.com/blackberry/winks/email/Email%20Translator%20-%20demo%20-%2020070725.htm
BlackBerry always strikes me as the platform that could be a great OS if it weren’t constrained by this horrible interface meant for tiny tiny TINY screens. It’s so unfriendly to the modern user that I don’t understand how business users deal with it. Just my pet peeve with BB.
Thanks for all the info. I thought your app auto-sent the translated message, but the manual approach works too. Having to open a menu to send as SMS, email, etc. is a bit clunky though. I would have put context buttons under or next to the translated result so the user could tap, or use the keyboard, to pick the option he wants rather than opening the menu and choosing from there. That’s one more click for every translated phrase, which adds up to quite a few additional clicks over time. Example:
> Do you like my documents?
Translated: Vind je mijn documenten? [S]ms [E]mail [D]efault (per User)
The screens are not that small:
Only the Pearl has 260×240 but it’s not widely used anyways.
The Interlecta UI is consistent with the OS UI, i.e. most of the operations are through a context menu, navigated in most cases by pushing the roller.
So, you compose a new email, but instead of choosing [Send] from the menu you choose [Translate] and then, if you like what you see, choose [Send].
You should get a BB toy, it’s addictive.
I forgot to mention that we thought about keeping a “target language” for certain people, those that are in your address book, so the default target language would be set conveniently before each translation. But in no way will we do an automatic translate-and-send based on predefined criteria. It could be an option for advanced users, but it could often lead to mistakes.
Technically, it’s doable, since the RIM OS APIs allow an application to add custom fields to address book entries. That is a very good feature, and I don’t know how many other mobile platforms support it. For example, all my contacts that are on Facebook now have pictures in my address book (from FB) after I installed the Facebook for BlackBerry application.
Neven Boyanov 9:11 pm on December 14, 2009
Sakhr are good; we partnered with them in 2007/2008 for English/Arabic. However, I’ve never had the chance to evaluate their ASR technology.