GitHub - jina-ai/jina: Cloud-native neural search framework for 𝙖𝙣𝙮 kind of data

Build Your First Jina App

Document, Executor, and Flow are the three fundamental concepts in Jina.

Using these three components, we build an app that finds the lines of a code snippet most similar to a query.

Preliminaries: character embedding, pooling, Euclidean distance
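
As a rough illustration of these three preliminaries, here is a minimal NumPy-only sketch, independent of Jina. The names `OFFSET`, `DIM`, `ONE_HOT`, `embed`, and `euclidean` are hypothetical helpers chosen for this example; the sketch assumes non-empty ASCII-range text, with anything outside codes 32 to 127 mapped to the last slot.

```python
import numpy as np

OFFSET = 32               # ASCII code of the space character
DIM = 127 - OFFSET + 1    # 96 slots; the last one doubles as "unknown"
ONE_HOT = np.eye(DIM)     # row i is the one-hot vector for slot i


def embed(text: str) -> np.ndarray:
    """Character embedding with mean pooling: average the one-hot rows of all chars."""
    idx = [ord(c) - OFFSET if OFFSET <= ord(c) <= 127 else DIM - 1 for c in text]
    return ONE_HOT[idx, :].mean(axis=0)


def euclidean(a: np.ndarray, b: np.ndarray) -> float:
    """Euclidean (L2) distance between two embeddings."""
    return float(np.linalg.norm(a - b))


# Mean pooling discards character order, so anagrams get the same embedding.
same = euclidean(embed("abc"), embed("cba"))
diff = euclidean(embed("a"), embed("b"))
```

Because pooling averages over characters, the embedding only captures how often each character occurs, which is enough for a toy similarity search but not for anything order-sensitive.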

Copy-paste the minimal example below and run it:

import numpy as np
from jina import Document, DocumentArray, Executor, Flow, requests

class CharEmbed(Executor):  # a simple character embedding with mean-pooling
    offset = 32  # ASCII code of the space character, the first printable char
    dim = 127 - offset + 1  # last pos reserved for `UNK`
    char_embd = np.eye(dim)  # one-hot embedding for all chars

    @requests
    def foo(self, docs: DocumentArray, **kwargs):
        for d in docs:
            r_emb = [ord(c) - self.offset if self.offset <= ord(c) <= 127 else (self.dim - 1) for c in d.text]
            d.embedding = self.char_embd[r_emb, :].mean(axis=0)  # average pooling

class Indexer(Executor):
    _docs = DocumentArray()  # for storing all documents in memory

    @requests(on='/index')
    def foo(self, docs: DocumentArray, **kwargs):
        self._docs.extend(docs)  # extend stored `docs`

    @requests(on='/search')
    def bar(self, docs: DocumentArray, **kwargs):
        docs.match(self._docs, metric='euclidean', limit=20)

f = Flow(port_expose=12345, protocol='http', cors=True).add(uses=CharEmbed, parallel=2).add(uses=Indexer)  # build a Flow with 2 parallel CharEmbed, though unnecessary
with f:
    f.post('/index', (Document(text=t.strip()) for t in open(__file__) if t.strip()))  # index all non-empty lines of _this_ file
    f.block()  # block and keep listening for requests
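
To make the `/search` step more concrete, the following is a minimal sketch of brute-force nearest-neighbour ranking by Euclidean distance, written with NumPy only. It is an assumption about what a call like `docs.match(..., metric='euclidean', limit=20)` conceptually computes, not Jina's actual implementation; the function `nearest` and the toy vectors are hypothetical.

```python
import numpy as np


def nearest(query: np.ndarray, index: np.ndarray, limit: int = 20):
    """Return (row, distance) pairs for the `limit` rows of `index` closest to `query`."""
    dists = np.linalg.norm(index - query, axis=1)  # Euclidean distance to every stored row
    order = np.argsort(dists)[:limit]              # smallest distance first
    return [(int(i), float(dists[i])) for i in order]


# Toy index of three 2-d embeddings (made-up values, for illustration only)
index = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]])
hits = nearest(np.array([0.9, 0.0]), index, limit=2)
```

Here `hits` lists the two closest rows in ascending order of distance. A linear scan like this is fine for a few hundred lines of code, while larger collections would normally call for an approximate nearest-neighbour index.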
