Title: Optimizing Box Embedding for Effective and Efficient Information Retrieval
Abstract: Most existing information retrieval methods represent each query or document as a vector and use the inner product or Euclidean distance as the similarity measure, so that the relevance between queries and documents can be quantified. Building on dense vector representations, many indexing techniques have been proposed to speed up the search process. In this paper, we introduce a fundamentally different approach to this problem, in which queries and documents are represented as box embeddings and the similarity between a query and a document is modeled by the intersection of two high-dimensional boxes. Compared with vector representations, box embeddings can effectively encode richer information, such as the uncertainty in parameter estimation and the semantic ranges of queries and documents. We propose a method for optimizing box embeddings, which enables box embedding-based indexing approaches that retrieve the relevant documents for a query in sub-linear time. Experiments on public datasets show that box embeddings and the box embedding-based indexing approaches are effective and efficient in ranking tasks.
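To make the idea of intersection-based similarity concrete, the following is a minimal sketch and not the paper's actual scoring function. It assumes each box is axis-aligned and stored as a pair of corner vectors (lower, upper), and it uses the hard intersection volume as the score; the name `intersection_volume` and the toy boxes are hypothetical.

```python
from math import prod


def intersection_volume(lo_a, hi_a, lo_b, hi_b):
    """Volume of the overlap between two axis-aligned boxes.

    Each box is given by its lower and upper corner, one value per dimension.
    This hard-volume score is an illustrative assumption, not the paper's model.
    """
    sides = [
        # Overlap length along one dimension; zero when the boxes are disjoint there.
        max(0.0, min(ha, hb) - max(la, lb))
        for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b)
    ]
    return prod(sides)


# Toy 2-D boxes as (lower corner, upper corner); values are made up.
query = ([0.0, 0.0], [2.0, 2.0])
doc_near = ([1.0, 1.0], [3.0, 3.0])
doc_far = ([5.0, 5.0], [6.0, 6.0])

score_near = intersection_volume(*query, *doc_near)
score_far = intersection_volume(*query, *doc_far)
print(score_near, score_far)
```

Under these assumptions, a document whose box overlaps the query box receives a positive score, while a disjoint document scores zero. That zero score for disjoint boxes is one intuition for why a spatial index over boxes could skip non-overlapping candidates.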