Internship - web portal development
The EU COST Network, TextLink, is looking for an intern to build a "Proof of Concept" web portal that provides access to resources developed through the network and implements a small suite of search, visualization and dissemination capabilities of value to the network and to stakeholders in the enterprise. This web portal will contribute to a prime objective of TextLink: coordinating the identification and development of monolingual and parallel corpora that have been enriched and made interoperable and co-searchable through the annotation of discourse-relational devices (DRDs).

The internship can be held at any of the following institutions:
- Charles University, Prague
- University of Toulouse
- Cyprus University of Technology
- Mykolas Romeris University, Vilnius
- University of Edinburgh
- University of Potsdam
- University of Valencia
- University of Wolverhampton
- University Politehnica, Bucharest

The internship can run from two to six months, starting any time after 1 June 2017 and ending before 31 March 2018. Financial support is available (covering the costs of accommodation, meals, and transport), up to a maximum of 5,000 euro for a six-month internship. The exact amount will be determined by the length of the stay. For further information about holding an internship at one of these institutions, please contact Professor Jacqueline Visconti (email@example.com).

ELIGIBILITY
Internships are open to researchers holding (or about to be awarded) an MSc degree or higher. Researchers of any nationality can apply, as long as their current primary affiliation is within a member country of TextLink: Belgium, Croatia, Cyprus, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Lithuania, Netherlands, Norway, Poland, Portugal, Romania, Slovenia, Spain, Sweden, Switzerland, Turkey, UK.
The internship must be held in a country other than that of the applicant's primary affiliation. It is also possible to split the internship between two institutions.

Candidates must provide evidence of skills in website development using PHP and MySQL, as well as knowledge of a language for handling text data (Perl and/or Python). We especially welcome applications from candidates interested in working with linguists, language researchers and/or professional translators.

APPLICATION PROCEDURE
Please submit the following information to Professor Jacqueline Visconti (firstname.lastname@example.org) by 15 May 2017:
- a CV;
- contact information for two people able to provide references;
- a statement of where you would like to hold the internship, your preferred dates, and your preferred length of internship (between two and six months);
- a preliminary work plan based on the tasks to be undertaken and completed, with an estimated timeline for each task, taking into account the proposed period of the internship.

TASKS RELATED TO THE WEB PORTAL
The first task is an essential one: to create a web portal hosted at a TextLink member institution, linked to a set of downloadable resources. Other desired capabilities include the following (note that content for these capabilities will be provided by TextLink members):
- Making a set of annotated corpora interoperable by mapping labels in one framework to exact or close/approximate matches in each of the other frameworks.
- Enabling search of multiple corpora for all tokens tagged with a given sense or for a specified token tagged with a given sense, as well as enabling further processing of the retrieved tokens, including:
  - collecting them into a set, along with a specifiable amount of context and meta-data (such as genre/period/etc.) which have been migrated down to the token level
  - visualizing the set of tokens and their contexts
  - computing statistics over the set
  - downloading the set in a specifiable format (such as a CSV file).
- Providing access to all the lexicons annotated with Discourse Relational Devices (DRDs) that have been compiled to date in the Potsdam DIMLEX framework and a lexicon browser developed for DIMLEX.
- Providing access to a dictionary of connectives in a specified language A and their translation into language B.
- Providing access to corpora with annotated DRDs: either the DRDs alone or sense-annotated tokens thereof.
- Providing access to the TED-MDB or other corpora that can be aligned with respect to explicit DRDs.
- For DRD-annotated corpora that have also been annotated with part-of-speech tags, parses, word senses, DRD position, order of the arguments to the DRD, argument span size, etc., enabling the following:
  - adding PoS, syntactic and/or semantic constraints to the search specification
  - collecting this additional annotation along with the tokens
  - computing statistics on lexical, syntactic and semantic properties associated with the tokens
  - displaying all this information
  - downloading it in a specifiable format such as CSV.