Natural Language Processing
We are currently conducting research and development in a wide range of areas in this field, from morphological analysis, named-entity recognition, dependency analysis and other essential technologies, to advanced applications such as text mining, information retrieval, information extraction and conversational processing.
Natural language processing is also actively used in machine learning, large-scale data and crowdsourcing.
We are carrying out research related to robust speech recognition for voice search and voice assistants, as well as spontaneous speech recognition for audio indexing.
We are working on the research and development of technology for extracting a variety of information from still images and videos. At Yahoo! JAPAN Research, we are placing special focus on the fields of large scale image similarity search and large scale specific object recognition.
Large Scale Image Similarity Search
Large Scale Specific Object Recognition
Nearest Neighbor Search Software for High Dimensional Vector Data
NGT - Neighborhood Graph and Tree for Indexing
This software provides commands and a library for performing high-speed approximate nearest neighbor searches against a large volume of data (several million to tens of millions of items) in a high-dimensional vector space (tens to several thousand dimensions).
NGT Site
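To make the search problem concrete, the following is a minimal sketch of the exact, brute-force nearest neighbor search that approximate methods are designed to avoid at scale. It is not NGT's algorithm or API. The function names, the toy data, and the `recall_at_k` helper are assumptions introduced here for illustration only.

```python
import heapq
import math
import random


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(data, query, k):
    """Return the k closest (distance, index) pairs by scanning every vector.

    The cost grows linearly with the number of vectors and dimensions,
    which is why large collections call for an approximate index instead.
    """
    scored = ((euclidean(vec, query), i) for i, vec in enumerate(data))
    return heapq.nsmallest(k, scored)


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true neighbors that an approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Hypothetical toy data: 1,000 random vectors in 64 dimensions.
rng = random.Random(0)
data = [[rng.random() for _ in range(64)] for _ in range(1000)]
query = data[42]

neighbors = exact_knn(data, query, k=5)
exact_ids = [i for _, i in neighbors]
```

An exact scan like this is mainly useful as ground truth: the result of an approximate index can be compared against `exact_ids` with `recall_at_k` to see how much accuracy is traded for speed.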
We are conducting research and development on analyzing user diversity through large-scale search logs, and we seek to develop human-oriented search technologies that make use of such analyses.
In addition, we are contributing to our various services by working with the Yahoo! Labs teams in Barcelona, Sunnyvale, New York and elsewhere in the world to study the latest Web search and query analysis technologies. Together with the Kawarabayashi Large Graph Project, we have carried out fundamental research on large graph analysis with regard to Web graphs, graphs derived from search logs, and social networks. We are also working on technologies to verify the diversity and authenticity of the information that Yahoo! JAPAN provides through its Web search and various other services.
Last but not least, we are working to create the future of Yahoo! Search together with you, the person reading this page right now!
Is being used with such services as Yahoo! Anshin Net.
Similar Text Search Technology
Machine learning is a research field that aims to obtain general knowledge from data, and is being applied to many other fields, including natural language processing, information retrieval, speech processing and image processing. Machine learning is mainly used in understanding the structure of data itself, and in predictions regarding unknown cases using past data.
Yahoo! JAPAN is not only developing new machine learning methods but also using them to extract valuable information from the large volumes of diverse data generated by its many users, and to improve its advertising, recommendation and other services.
An enormous amount of data is accumulated daily at Yahoo! JAPAN, which boasts more than 100 smartphone apps and over 50 billion monthly page views. The data science field will be the driving force behind data usage inside the company, searching for a wide variety of contexts from among this vast and diverse data.
Recently we have been putting effort into using time series data (pattern extraction, tensors, etc.) and into the research field of heterogeneous data analysis (integrated data analysis).
Extracting Inter-personal Relationships from Pedometer Data (Joint Research With the Shimosaka Research Group, The University of Tokyo)
This research attempts to extract inter-personal relationships from pedometer data alone. We developed technology that detects when members of a group accompany one another by looking for similarities between individuals' step count patterns, and that describes inter-personal relationships based on how often members accompany one another. Please check here (external site) for more details.
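The idea of comparing step count patterns can be sketched in a few lines. The example below is a simplified illustration, not the method used in the joint research: it assumes per-minute step counts, uses Pearson correlation over fixed windows as a stand-in similarity measure, and the window size, threshold, and sample data are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def accompaniment_frequency(steps_a, steps_b, window=4, threshold=0.9):
    """Fraction of non-overlapping windows in which two step series move together.

    A window counts as "accompanying" when the correlation of the two
    step count patterns reaches the (assumed) threshold.
    """
    hits = total = 0
    for start in range(0, min(len(steps_a), len(steps_b)) - window + 1, window):
        total += 1
        if pearson(steps_a[start:start + window],
                   steps_b[start:start + window]) >= threshold:
            hits += 1
    return hits / total if total else 0.0


# Hypothetical per-minute step counts for two people over twelve minutes.
alice = [0, 40, 80, 20, 10, 60, 90, 30, 0, 0, 50, 70]
bob = [0, 42, 78, 22, 90, 10, 0, 55, 0, 0, 52, 68]

frequency = accompaniment_frequency(alice, bob)
```

In this toy data the two series match closely in the first and last windows and diverge in the middle one, so the pair is judged to accompany each other in two of three windows. Repeating this for every pair in a group gives a frequency matrix that can serve as a rough description of who tends to walk with whom.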
Big Data Report
An attempt at representing social trends using data accumulated internally. Please check here for details.
We are also making use of location data and search and click logs in this research. We will post announcements when we reach a stage where we are able to do so.
Time series data, heterogeneous data, clustering, pattern extraction, tensors, machine learning, statistics
In this area, we aim to develop technology to provide higher-value services by fusing the services operated by Yahoo! JAPAN with valuable knowledge data accumulated by outside companies.
Themes we are working on include: large-scale cascade classifiers that support the construction and utilization of ontologies; conversational agent frameworks that place importance on expandability and flexibility in order to allow anyone to make use of knowledge data; and distributed RDF storage managers for using large-scale knowledge data to the fullest.
Security & Privacy
As big data, the internet of things, and social media change our daily lives in both cyberspace and physical space, new issues are emerging in information security and user privacy. To tackle these issues, we are conducting research on identity and access management, usable and strong authentication, and trust management, with the goal of developing a secure and privacy-friendly platform for context-aware applications in ubiquitous environments.
We are also working on standardization activities for developing password-free authentication specifications in international technology forums and designing a security architecture of trust-based identity management systems with governmental communities.
Next-generation User Interfaces and Interaction (HCI)
Interaction occurs between humans and machines whenever a person uses an application or device. This interaction and the user interfaces that elicit it are essential elements of easy-to-use services. We are carrying out research in this area on the future of user interfaces and human-computer interaction.
The act of creating necessary services, ideas and content through soliciting contributions from a large and unspecified group of people is called crowdsourcing: a portmanteau of the words “crowd” and “outsourcing” that was proposed by journalist Jeff Howe in 2006. In recent years, with the spread of the online community, services in a variety of fields that enlist the help of an undefined public are drawing attention.
Yahoo! JAPAN also started its own crowdsourcing service, Yahoo! Crowdsourcing, in January 2013. It allows companies with problems, or “tasks,” to solve them with the help of Yahoo! JAPAN users, who can obtain T Points as a reward for their assistance. The service has acquired over 200,000 registered users so far, and in less than a year since its launch it has become the largest crowdsourcing service in Japan.
Crowdsourcing is a new area of research that spans many fields, including education, economics, psychology, and computer science, because it sits at the intersection of the crowd and machines, and of work and leisure. By investigating and testing new tasks that crowdsourcing can solve, we aim to evolve it into an easy-to-use, more advanced system for the division of labor, and to connect the results of our research to solutions for society’s issues.
Yahoo! JAPAN will make use of the results of this research in proposals to its clients and service improvements.