Tagged: NLP

  • Bertrand 10:59 pm on January 24, 2010
    Tags: NLP, search

    New Search Engine 

    I have a question!

    I would like to know what the real possibilities are for building a “smart” search engine that answers your questions directly.

    For example, let’s say I want to know Napoleon’s birth date. I first go to Google and type “Napoleon birthdate”, I find a few links and, out of habit, I click on Wikipedia. There I scan the first lines to find what I’m looking for.

    Is there a way to get this information directly by asking a simple question like this one? I mean, no one cares about digging through a search engine or an encyclopedia; they only care about the answers.

    Even better, what if I could speak my questions directly into a mic and get the reply by voice, avoiding typing and mouse gestures to save time? Imagine a child playing with such an application for years! Or, on the contrary, this could make the learning process almost useless for most people.

    How far could we go in terms of question complexity?

    John told me a few years ago about ontologies and it really blew my mind, but what now? Is there something like Wikipedia but ontology-oriented, more open than a “simple” dictionary?
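
To make the question concrete, here is a minimal sketch of how a direct answer could be looked up once the facts are stored in a structured, ontology-like form. Everything in it is an assumption for illustration: the `FACTS` table, the question patterns, and the `answer` function are hypothetical and do not describe how any real search engine works.

```python
import re

# Hypothetical mini knowledge base: (subject, property) -> value.
FACTS = {
    ("napoleon", "birth date"): "15 August 1769",
    ("napoleon", "birth place"): "Ajaccio, Corsica",
}

# Hypothetical question patterns, each mapped to the property it asks about.
PATTERNS = [
    (re.compile(r"when was (?P<subject>.+?) born\??$"), "birth date"),
    (re.compile(r"where was (?P<subject>.+?) born\??$"), "birth place"),
    (re.compile(r"(?P<subject>.+?) birth ?date$"), "birth date"),
]


def answer(question):
    """Return the stored fact for a recognized question, or None."""
    text = question.strip().lower()
    for pattern, prop in PATTERNS:
        match = pattern.match(text)
        if match:
            return FACTS.get((match.group("subject"), prop))
    return None


print(answer("When was Napoleon born?"))
print(answer("Napoleon birthdate"))
```

The hard parts of a real system are exactly the ones this sketch skips: filling the fact table and understanding questions that do not fit a fixed pattern.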

     
    • Daniel Radev 3:56 am on January 25, 2010

      Currently Wolfram Alpha (http://www.wolframalpha.com) is the best answer as far as I know.
      It gives you answers to questions like “Napoleon birthdate” and it has some other nice features (type “AAPL vs GOOG”, for example)…

      • Neven Boyanov 8:21 pm on January 26, 2010

        That is very interesting. I searched for various things (a few of them very stupid), but what surprised me was this: “Jesus Christ date of birth” gives ~4 BC, which looks inaccurate. But it’s still better than nothing, i.e. as if the guy never existed. ;)

    • jyonkov 7:25 am on January 25, 2010

      I agree with Daniel that Wolfram Alpha fits your description best. Here are some other interesting ones:
      http://www.clusty.com/ – categorizes/clusters searches.
      http://labs.hakia.com/hakia-lab.html – they are based in Turkey, I think.
      http://www.powerset.com/ – from MS
      http://www.google.com/search?hl=en&esrch=FT1&tbo=1&tbs=ww:1&q=ontology&btnG=Search – Wonder Wheel
      I can’t skip WordNet: http://wordnetweb.princeton.edu/perl/webwn

    • Bertrand 8:33 am on January 25, 2010

      OK guys, thanks for the tips. I’ll play around with them.

    • Jean 4:55 pm on February 2, 2010

      Aardvark (vark.com) seems like something to check out too.
      The intelligence of the engine consists of routing your questions to experts in your social network.

      More human interaction, better?
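
As a rough illustration of what “routing a question to experts” could mean, here is a minimal sketch that ranks contacts by how many of their known topics appear in the question. This is a plain keyword-overlap heuristic chosen for the example, not Aardvark’s actual method; the `EXPERTS` table and the `route` function are hypothetical.

```python
# Hypothetical contacts and the topics each one knows about.
EXPERTS = {
    "alice": {"python", "nlp", "parsing"},
    "bob": {"cooking", "travel"},
    "carol": {"nlp", "ontology", "rdf"},
}


def route(question, experts=EXPERTS, top_n=2):
    """Return up to top_n contacts whose topics overlap the question words."""
    words = {w.strip("?,.!").lower() for w in question.split()}
    scored = [(len(words & topics), name) for name, topics in experts.items()]
    # Keep only contacts with at least one matching topic; best match first,
    # ties broken alphabetically so the result is deterministic.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [name for _, name in ranked[:top_n]]


print(route("How do I store an ontology in RDF for NLP?"))
```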

      • jyonkov 2:18 pm on February 13, 2010

        Thanks Jean, I signed up and I like it! I even answered a question through GTalk :)

        I’ve been thinking about a “universal” instant messenger with a good plugin framework that makes it easy to write applications that use the FOAF contacts network. I’ve had this idea since the PMF days (PMF was a framework we wrote in Qt while working together…) but never got around to it. It would be cool if one could hack together a network app with a GUI in a week and publish it to the world, or some subset of it, without effort… Ahh wait, that sounds like the Apple App Store?! … well, not exactly.

  • jyonkov 7:40 pm on October 9, 2009
    Tags: NLP

    NLP and Ontologies 

    For a while now I’ve been thinking about using knowledge representation (ontologies) as a base for creating a modular Natural Language Processing system focused on extracting structured data from unstructured text. For example, we can create or use ontologies (models) that describe “simple” concepts like Address, Time, Task, Expense, Transaction, etc., and use them to “match” information in a text stream. The reason I’m writing this is that I think there is common ground for collaboration… I know Stefan is interested in RDF/OWL, the company Neven is involved with works in a closely related domain, and finally I have been playing with Google Wave, which I think is a good platform for creating intelligent bots that will be very easy to distribute if they turn out to be useful :)
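
A minimal sketch of the “match concepts against a text stream” idea, under strong simplifying assumptions: each concept is reduced to a regular expression whose named groups stand in for the concept’s properties. A real ontology (for example in RDF/OWL) would carry far more structure; the `CONCEPTS` table and the `extract` function are hypothetical names used only for this illustration.

```python
import re

# Hypothetical stand-ins for ontology concepts: each concept name maps to a
# pattern whose named groups play the role of the concept's properties.
CONCEPTS = {
    "Time": re.compile(
        r"\b(?P<hour>\d{1,2}):(?P<minute>\d{2})\s*(?P<period>am|pm)?", re.I
    ),
    "Expense": re.compile(
        r"(?P<currency>USD|EUR|\$)\s?(?P<amount>\d+(?:\.\d{2})?)"
    ),
}


def extract(text):
    """Return one structured record per concept instance found in text."""
    records = []
    for concept, pattern in CONCEPTS.items():
        for match in pattern.finditer(text):
            # Drop optional properties that were not present in the text.
            fields = {k: v for k, v in match.groupdict().items() if v is not None}
            records.append({"concept": concept, **fields})
    return records


for record in extract("Lunch with Stefan at 12:30 pm, paid $14.50"):
    print(record)
```

Keeping each concept as a separate entry is what makes the approach modular: a new concept such as Address can be added without touching the matching loop.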

    Here are some references:
    http://wordnet.princeton.edu/
    http://protege.stanford.edu/
    http://jena.sourceforge.net/
    http://www.openrdf.org/
    http://code.google.com/apis/wave/guide.html

     
    • Daniel Radev 9:20 am on October 10, 2009

      Although I’m currently not even in a nearby domain (kernel-level C driver development is not even close by any means), it is very interesting to me and I would gladly participate…

      • jyonkov 10:48 am on October 10, 2009

        Awesome, I’ll try to start with some examples “soon” :)

    • Nikolay 8:41 pm on October 10, 2009

      I want to add a few NLP projects that I personally find interesting: OpenNLP, which is in Java, and NLTK, which is in Python.

      • jyonkov 6:10 am on October 11, 2009

        Thanks Nikolay, I browsed around the links you provided and stumbled on Apache UIMA, a graduated IBM Research project which has recently been approved as an OASIS Standard. I think that it could be used as a framework in which we can plug our NLP modules when they mature.

        • Nikolay 4:10 am on October 13, 2009

          It seems pretty good, but why is it still in the incubator… since 2006?

    • Nikolay 6:16 pm on October 14, 2009

      Here’s one project that uses Apache UIMA – SEASR. It has interesting stuff such as sentiment analysis.

    • Neven Boyanov 10:11 pm on October 14, 2009

      This is a very interesting topic indeed.

      Knowledge representation is just one part of the process. Representing knowledge in digital form is a relatively simple task, no matter how complex the structures or algorithms you use or how much processor power and memory you need. But the task does not end with representing the knowledge; you need to do something with it.

      One obstacle in the past was the enormous quantity of information that first has to be gathered and then processed, but now, with Google and all the other web spiders and similar tools, collecting the information is feasible.

      By the way, Google’s machine translation is mostly based on texts they find on the web and judge to be translations of one another. It’s a form of statistical machine translation. Others use other sources of parallel corpora, such as books, legal documents, etc.
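
To show what “statistical” means here, below is a minimal sketch of IBM Model 1, a textbook word-alignment model that learns word translation probabilities from sentence pairs with the EM algorithm. It is swapped in as the simplest illustration of learning from a parallel corpus and is not a description of Google’s system; the three-sentence corpus and the function names are assumptions made for the example.

```python
def train_ibm_model1(pairs, iterations=10):
    """Estimate t[(target_word, source_word)] = P(target | source) with EM."""
    tokenized = [(src.split(), tgt.split()) for src, tgt in pairs]
    target_vocab = {w for _, tgt in tokenized for w in tgt}
    uniform = 1.0 / len(target_vocab)
    # Start from a uniform guess for every word pair that co-occurs.
    t = {(tw, sw): uniform for src, tgt in tokenized for tw in tgt for sw in src}

    for _ in range(iterations):
        count = {}
        total = {}
        # E-step: spread each target word over the source words of its sentence.
        for src, tgt in tokenized:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] = count.get((tw, sw), 0.0) + frac
                    total[sw] = total.get(sw, 0.0) + frac
        # M-step: renormalize the expected counts per source word.
        for (tw, sw), c in count.items():
            t[(tw, sw)] = c / total[sw]
    return t


def best_translation(t, source_word, target_vocab):
    """Pick the target word with the highest learned probability."""
    return max(target_vocab, key=lambda tw: t.get((tw, source_word), 0.0))


# Hypothetical toy parallel corpus (English -> German).
corpus = [("the house", "das haus"), ("the book", "das buch"), ("a book", "ein buch")]
model = train_ibm_model1(corpus)
vocab = {"das", "haus", "buch", "ein"}
for word in ["the", "house", "book", "a"]:
    print(word, "->", best_translation(model, word, vocab))
```

No dictionary is given to the model; the pairing of words emerges only from which words keep appearing together across sentence pairs.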

      I will draw a parallel here: just as the output of a hypothetical best compression algorithm is very similar to that of a random generator, in its behavior and in the result it produces, the perfect machine translation engine is very similar to the best knowledge representation and processing system. It should in fact be a representation of the entire human knowledge, with the ability to derive new representations of it in the form of one human language or another.
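
The compression half of this parallel can be checked with a small experiment. The sketch below, which uses zlib as a stand-in for a “good” compressor and a made-up word list as input, compares the byte entropy of a text before and after compression and then tries to compress the result a second time. The `demo` function and its parameters are hypothetical names for this illustration.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of the byte distribution, in bits per byte (max 8)."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def demo(seed=0, n_words=50000):
    """Compress a pseudo-random word stream once and twice, and measure it."""
    rng = random.Random(seed)
    words = ["search", "engine", "ontology", "language", "answer", "question"]
    text = " ".join(rng.choice(words) for _ in range(n_words)).encode("ascii")
    once = zlib.compress(text, 9)
    twice = zlib.compress(once, 9)
    return {
        "text_entropy": byte_entropy(text),
        "compressed_entropy": byte_entropy(once),
        "text_size": len(text),
        "once_size": len(once),
        "twice_size": len(twice),
    }


for key, value in demo().items():
    print(key, value)
```

The compressed bytes should look much closer to uniform noise than the original text does, and a second compression pass should gain little or nothing, because the redundancy a compressor relies on has already been removed.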

      I like the idea of presenting a program by what it does. The pairing of a computer language with computer behavior is not very different from the pairing of human language with human behavior. One day we will create programs just by giving verbal examples to some kind of knowledge processor, which will convert them to machine code based on what we expect the program to do.

      I forgot to mention that there was an internationally recognized NLP conference in Bulgaria in September, organized by the Bulgarian Academy of Sciences, which I was invited to attend but couldn’t for reasons beyond my control. The lecturers were mostly from the EU, with a few from the US.

      Also, don’t take what I’m saying here about NLP too seriously; I’m not proficient enough in that area. :P

      • Svetoslav Vencislavov Pavlov 12:36 am on October 27, 2009

        Hello! I’m not sure what an ontology is; I’m not a linguist. Let’s say an “ontology” is the grammatical way of building a sentence (in English). Since I have a simple language processor (a phrase generator), we could connect a pattern :) / a logical or even derived relational database with different “ontologies”/patterns to the phrase generator. And then we are not talking about Natural Language Processing but about a Virtual Natural Language Processor :)
        Every grammatical rule may become an “ontology”.
        Just as you can connect one application to different databases, you can create a purely logical database based on Artificial Intelligence, with predicates connected to the application through recurrent connections (not a database) based on neural network theory. And so this application becomes a BOT to communicate with :)

        The beginning (references):
        http://www.languagetool.org/ (be patient and smart)
        http://extensions.services.openoffice.org/node/2297
