Open Text Summarizer

old_man
Joined: 2008-03-20 08:48
Open Text Summarizer

Automatic text summarization is a technique in which a computer program summarizes a document. A text is put into the computer and a highlighted (summarized) text is returned. The Open Text Summarizer is an open source tool for summarizing texts. The program reads a text and decides which sentences are important and which are not.
[text from the website: http://libots.sourceforge.net/ ]
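
To make the idea of "deciding which sentences are important" a bit more concrete, here is a toy sketch in Java. It is not the algorithm OTS uses (the thread does not describe that); it only assumes a simple word-frequency score, and the class and method names (ToySummarizer, summarize, words) are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToySummarizer {

    // Keeps roughly ratioPercent of the sentences, chosen by a crude
    // word-frequency score, and returns them in their original order.
    // Assumption: this is an illustration only, not how OTS scores sentences.
    static List<String> summarize(String text, int ratioPercent) {
        // Split after '.', '!' or '?' when followed by whitespace.
        String[] sentences = text.trim().split("(?<=[.!?])\\s+");

        // Count how often each word occurs in the whole text.
        Map<String, Integer> freq = new HashMap<>();
        for (String s : sentences) {
            for (String w : words(s)) {
                freq.merge(w, 1, Integer::sum);
            }
        }

        // A sentence scores the sum of the frequencies of its words.
        double[] score = new double[sentences.length];
        Integer[] idx = new Integer[sentences.length];
        for (int i = 0; i < sentences.length; i++) {
            idx[i] = i;
            for (String w : words(sentences[i])) {
                score[i] += freq.get(w);
            }
        }

        // Number of sentences to keep (at least one).
        int keep = Math.max(1, (int) Math.round(sentences.length * ratioPercent / 100.0));
        keep = Math.min(keep, sentences.length);

        // Highest score first, then restore document order for the kept ones.
        Arrays.sort(idx, (a, b) -> Double.compare(score[b], score[a]));
        Arrays.sort(idx, 0, keep);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < keep; i++) {
            result.add(sentences[idx[i]]);
        }
        return result;
    }

    // Lower-cased words longer than three letters; the length cut-off is a
    // crude stand-in for a real stop-word list.
    static List<String> words(String sentence) {
        List<String> out = new ArrayList<>();
        for (String w : sentence.toLowerCase().split("[^\\p{L}]+")) {
            if (w.length() > 3) {
                out.add(w);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Cats sleep a lot. Cats chase mice and cats climb trees. Dogs bark.";
        for (String s : summarize(text, 34)) {
            System.out.println(s);
        }
    }
}
```

A real summarizer would presumably rely on per-language word lists, which may be why a separate XML file is needed for each language, but that is a guess.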

The author made a Windows build of Ots-0.4.2, available on the same website (just click the "for Windows" icon in the bar on the left side of the screen).

This Windows build works in a very rudimentary manner, by running a batch file "runme.bat", but with some tinkering it should be easy to give it a somewhat more user-friendly interface. Such an interface for the program exists online, at http://www.splitbrain.org/services/ots

The program normally works with the following command line options:

-r --ratio [0%..100%]
    Select the summary ratio
-d --dic lang
    Use alternate dictionary file specified by an ISO 639 language code, for example 'he', 'de', 'fr', 'es'
-o --out file
    Redirect output results to file instead of stdout
-h --html
    Print summary in highlighted html form
-k --keywords
    Print key words in the article
-a --about
    Print a short description of what the article is about
-v --version
    Display what version of ots this is
-? --help
    Display this help screen
--usage
    Display brief usage message
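
As a rough sketch of how these options could be combined when the program is started from another program, here is a small Java example that assembles a command line and could hand it to ProcessBuilder. Several things are assumptions and not taken from the help text above: that the executable is called "ots" and is on the PATH, that option values are passed as separate arguments, and that the input file goes last. The class and method names (OtsCommand, build, run) are made up for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class OtsCommand {

    // Builds the argument list for one summarization run.
    // lang and outFile may be null, in which case the option is left out.
    static List<String> build(String otsExecutable, int ratioPercent, String lang,
                              boolean html, String outFile, String inputFile) {
        if (ratioPercent < 0 || ratioPercent > 100) {
            throw new IllegalArgumentException("ratio must be between 0 and 100");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add(otsExecutable);
        cmd.add("--ratio");                      // summary ratio
        cmd.add(Integer.toString(ratioPercent)); // assumption: passed as a bare number
        if (lang != null) {
            cmd.add("--dic");                    // ISO 639 code, e.g. "de"
            cmd.add(lang);
        }
        if (html) {
            cmd.add("--html");                   // highlighted HTML output
        }
        if (outFile != null) {
            cmd.add("--out");                    // write to a file instead of stdout
            cmd.add(outFile);
        }
        cmd.add(inputFile);                      // assumption: input file comes last
        return cmd;
    }

    // Starts the external program and waits for it to finish.
    // Only works if the executable named in cmd is actually installed.
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        List<String> cmd = build("ots", 20, "de", true, "summary.html", "article.txt");
        System.out.println(String.join(" ", cmd));
        // To really launch the program, call run(cmd) here.
    }
}
```

Running the command-line tool as a separate process avoids any need for Java bindings; whether the argument order above matches a given build is best checked against the output of --help.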

The program works fine and can be very useful, as long as you don't expect it to perform miracles. If you use the runme.bat file, the standard output is an HTML text with the important sentences highlighted. I have used it several times already, with good results. It also works in languages other than English, although the results are somewhat less refined. All you need for that is the XML file for the language concerned.

Those language files are not included in the download (only EN.XML), but can be easily obtained by downloading the file "ots-libs-0.4.2-11.fc7.ppc64.rpm" from the site
http://www.rpm-find.net/linux/rpm2html/search.php?query=libots-1.so.0%28...

This is merely a zip within a zip within a zip file. If you continue unzipping you'll eventually reach a directory "ots", where all the language XML files can be found.

Could somebody convert this into a working portable app? Possibly with a rudimentary but workable GUI? I would be very grateful...

canoji
Joined: 2014-07-01 08:23
How to use

How can I use the summarizer in a Java process?

old_man
Joined: 2008-03-20 08:48
I'm afraid I can't help you

... I'm not a programmer. I don't even know if any Java routines are involved in this program.

I suggest you have a look at the links given above - and maybe also at this one
https://gist.github.com/splitbrain/399029
which provides the code (I think) for the web interface.

Still think this would be a good addition to the PortableApps arsenal.

dirk

Wm ...
Joined: 2010-07-17 12:37
Example use, please

You said up top that you'd used this.

Would you give a few examples where this might be useful to other people, please?

Wm

old_man
Joined: 2008-03-20 08:48
I used it

a couple of times on biographical texts and descriptions of factual events. At first out of mere curiosity, for it seemed rather strange that an AI should be able to do this kind of thing. To my surprise, although not perfect, the summaries it produced (at a 20% ratio) were usable. (I double-checked them against the original.)

That goes for the English language version. As I said, the other languages (I'm fluent in 5 languages, so I tried them too) produce less trustworthy summaries.

I wouldn't trust these summaries with my life, but it seems to me the program can help to reduce the sheer volume of long, logically constructed texts. There are commercial programs on the market that do the very same thing.

Whether this is of any interest to others is hard for me to tell...

dirk
