A simple sentence-transformers example

code:usage.py

 from sentence_transformers import SentenceTransformer
 
 # Load a small pretrained model (downloaded on first use)
 model = SentenceTransformer("all-MiniLM-L6-v2")
 
 sentences = [
     "This framework generates embeddings for each input sentence",
     "Sentences are passed as a list of string.",
     "The quick brown fox jumps over the lazy dog.",
 ]
 
 # encode() returns one embedding vector per input sentence
 embeddings = model.encode(sentences)
 
 for sentence, embedding in zip(sentences, embeddings):
     print("Sentence:", sentence)
     print("Embedding:", embedding)
     print()

Very small, and quick to get running!

Let's compute the cosine similarity

Hypothesis: sentences 0 and 1 are similar; sentence 2 is not similar to sentence 0

code:python

 >>> from sklearn.metrics.pairwise import cosine_similarity
 >>> # cosine_similarity expects 2D input, so reshape each vector to (1, n)
 >>> cosine_similarity(embeddings[0].reshape(1, -1), embeddings[1].reshape(1, -1))
 array([[0.53807926]], dtype=float32)
 >>> cosine_similarity(embeddings[0].reshape(1, -1), embeddings[2].reshape(1, -1))
 array([[0.11805625]], dtype=float32)

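The hypothesis is supported: sentences 0 and 1 score about 0.54, while sentences 0 and 2 score only about 0.12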
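
For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. Below is a minimal pure-Python sketch of that calculation. The function and the toy vectors `v0`, `v1`, `v2` are made up for illustration; they are not real embeddings and this is not how scikit-learn implements it internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors (hypothetical, not real embeddings)
v0 = [1.0, 2.0, 0.0]
v1 = [2.0, 4.0, 0.0]  # points in the same direction as v0
v2 = [0.0, 0.0, 3.0]  # orthogonal to v0

print(cosine_similarity(v0, v1))  # close to 1: same direction
print(cosine_similarity(v0, v2))  # 0: nothing in common
```

Values near 1 mean the vectors point the same way, values near 0 mean they are unrelated, which is why 0.54 vs 0.12 above reads as "more similar" vs "less similar".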