UMLSKS Web Services

The Federal Government provides a truly astounding collection of scientific resources for free to its citizenry.

The amount of data, and even code, that they make available never ceases to amaze me.

The UMLSKS is a good example of this.

The Unified Medical Language System (UMLS) Knowledge Sources (KS) are a collection of documents and services useful for building knowledge-based applications in the medical domain.

From their site:

The Unified Medical Language System (UMLS) Knowledge Sources
and related lexical programs, developed at the U.S. National
Library of Medicine (NLM), provide access to the UMLS. The
Metathesaurus, the Semantic Network, and the SPECIALIST lexicon
are part of the UMLS and are designed primarily for use by
system developers. They are meant to be consulted and used by
application programs to interpret and refine user queries, to
map the user’s terms to appropriate controlled vocabularies and
classification schemes, to interpret natural language, and to
assist in structured data creation. They are also useful as
reference tools for database builders, librarians and other
information professionals.

which sounds very promising given my current gig.

It also has some portions which are generically useful for NLP work, such as the Word Sense Disambiguation (WSD) Test Collection.

You have to register to get a login, but they expose a couple of handy WSDLs:

curl 'https://login.nlm.nih.gov/auth?wsdl' > cas.wsdl
curl 'http://umlsks.nlm.nih.gov/UMLSKS/services/UMLSKSService?wsdl' > umlsks.wsdl

When WSDLs go wrong

Apparently these WSDLs are olde timey:

 [ERROR] rpc/encoded wsdls are not supported in JAXWS 2.0.
   line 128 of file:/Users/brian/funk/umlsks/cas.wsdl 
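For what it's worth, that error means the WSDL's bindings use SOAP section 5 encoding (`use="encoded"`), which JAX-WS 2.0 tooling refuses to handle. Here is a rough sketch of how you could check a WSDL for this up front. The class name `WsdlStyleCheck` and the method `usesEncoded` are made up for illustration, and looking only at `use="encoded"` on `soap:body` elements is a heuristic, not a full rpc/encoded test.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of any UMLSKS or Axis distribution.
public class WsdlStyleCheck {
    // Namespace of the WSDL 1.1 SOAP binding extension elements (soap:body etc.)
    static final String SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/";

    /** Returns true if any soap:body element in the WSDL text declares use="encoded". */
    static boolean usesEncoded(String wsdlXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getElementsByTagNameNS
            NodeList bodies = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(wsdlXml.getBytes(StandardCharsets.UTF_8)))
                    .getElementsByTagNameNS(SOAP_NS, "body");
            for (int i = 0; i < bodies.getLength(); i++) {
                Element body = (Element) bodies.item(i);
                if ("encoded".equals(body.getAttribute("use"))) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse WSDL", e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java WsdlStyleCheck <file.wsdl>");
            return;
        }
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println(usesEncoded(xml)
                ? "encoded bodies found: expect JAX-WS tooling to reject this WSDL"
                : "no encoded bodies found");
    }
}
```

Running it as `java WsdlStyleCheck cas.wsdl` against the file downloaded above should tell you which toolkit you are in for.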

So… you need to get crusty old Axis 1.4, et alia, to start having fun:

mkdir lib
cd lib
mirror="http://mirrors.ibiblio.org/pub/mirrors/maven2"
wget ${mirror}/axis/axis/1.4/axis-1.4.jar
wget ${mirror}/commons-logging/commons-logging/1.1.1/commons-logging-1.1.1.jar
wget ${mirror}/commons-discovery/commons-discovery/0.2/commons-discovery-0.2.jar
wget ${mirror}/javax/xml/jaxrpc-api/1.1/jaxrpc-api-1.1.jar
wget ${mirror}/geronimo-spec/geronimo-spec-saaj/1.1-rc4/geronimo-spec-saaj-1.1-rc4.jar
wget ${mirror}/wsdl4j/wsdl4j/1.4/wsdl4j-1.4.jar
wget ${mirror}/javax/activation/activation/1.1/activation-1.1.jar
wget ${mirror}/javax/mail/mail/1.4/mail-1.4.jar
cd ..
classpath="$( echo lib/*.jar | tr ' ' ':' )"
bigJ="java -classpath ${classpath}"
wsdl2j="${bigJ} org.apache.axis.wsdl.WSDL2Java -o generated  -d Session -s -S true"
# -N maps a WSDL namespace to a Java package and takes the form -N<namespace>=<package>
${wsdl2j} -Nurn:authorization.umlsks.nlm.nih.gov=gov.nih.nlm.umlsks.authorization cas.wsdl
${wsdl2j} -Nurn:umlsks.nlm.nih.gov=gov.nih.nlm.umlsks umlsks.wsdl
src="$( find generated -name '*.java' )"
javac -classpath ${classpath} ${src}
jar cf umlsks-1.0.jar -C generated gov
javadoc -d javadoc ${src}

Using it

There is good API documentation, and I had credentials just a few hours after requesting them!

I just scraped out the sample code, removed the reference to “umlsRelease”, and searched for “measles”:

CUI='C0025010' CN='Measles Vaccine'
CUI='C0025007' CN='Measles'

Neato!

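    // Needs java.io.FileInputStream, java.net.URL and java.util.Properties,
    // plus the stub classes generated by WSDL2Java above.
    public static void main( String a[] ) throws Exception {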
        // Locate the authentication web service
        AuthorizationPortType authPortType = (
                new AuthorizationPortTypeServiceLocator().getAuthorizationPort( 
                  new URL( "https://login.nlm.nih.gov:443/auth" )
                )
        );

        Properties credentials = new Properties();
        credentials.load( new FileInputStream( "credentials.properties" ) );

        String pgt = authPortType.getProxyGrantTicket( 
                credentials.getProperty( "username" )
                , credentials.getProperty( "password" ) 
        );
        String proxyTicket = authPortType.getProxyTicket( 
                pgt, "http://umlsks.nlm.nih.gov" 
        );
        System.out.printf( "pgt='%s'\nticket='%s'\n", pgt, proxyTicket );

        // Locate the UMLSKS web service
        UMLSKSServicePortType umlsksService = (
          new UMLSKSServiceLocator().getUMLSKSServicePort(
             new URL( "http://umlsks.nlm.nih.gov/UMLSKS/services/UMLSKSService" )
           )
        );

        // Build the request object
        ConceptIdExactRequest request = new ConceptIdExactRequest();
        request.setCasTicket( proxyTicket );
        request.setSearchString( "measles" );

        // Execute the operation
        ConceptIdGroup group = umlsksService.findCUIByExact(request);

        // Print the results
        Object[] contents = group.getContents();
        for (int i = 0; i < contents.length; i++) {
            ConceptId cid = ( ConceptId ) contents[ i ];
            System.out.printf( "CUI='%8s' CN='%s'\n", cid.getCUI(), cid.getCN() );
        }   
    }   

Just in case WordPress mangled it.

It is pretty much just the example Client.java, but it reads the username and password from a properties file instead of taking them from the command line.
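If you want the properties-file part to be a bit less fragile than my two-liner, something like the following would do. It is only a sketch: the `Credentials` class and its `from` and `load` methods are names I made up, and the only things taken from the code above are the file name `credentials.properties` and the keys `username` and `password`. It closes the file when done and fails loudly if a key is missing, rather than passing null to the login service.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical holder for the two values the login call needs.
public class Credentials {
    final String username;
    final String password;

    private Credentials(String username, String password) {
        this.username = username;
        this.password = password;
    }

    /** Builds credentials from already-loaded properties, rejecting missing keys. */
    static Credentials from(Properties props) {
        return new Credentials(require(props, "username"), require(props, "password"));
    }

    /** Reads the properties file at the given path and closes it afterwards. */
    static Credentials load(String path) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return from(props);
    }

    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("missing '" + key + "' in credentials file");
        }
        return value;
    }

    public static void main(String[] args) {
        Credentials c = load(args.length > 0 ? args[0] : "credentials.properties");
        // Never print the password; the username is enough to confirm the load worked.
        System.out.println("loaded credentials for " + c.username);
    }
}
```

The file itself is just two lines, `username=...` and `password=...`, and is best kept out of version control.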

Here is some good reading on UMLS
