Facebook AI Similarity Search (Faiss /Fez/) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, including sets that may not fit entirely in memory. It also contains supporting code for evaluation and parameter tuning.
Official Faiss documentation: Welcome to Faiss Documentation
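The following shows how to use the functionality related to the FAISS vector database, focusing on features specific to this integration. After working through it, it may be helpful to explore the related use-case pages to see how this vector store can be used as part of a larger chain.
Setup
This integration lives in the langchain-community package. We also need to install the faiss package itself. We will use OpenAI for embeddings, so those requirements must be installed as well. Everything can be installed with the following command: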
pip install -U langchain-community faiss-cpu langchain-openai tiktoken
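Note that you can install faiss-gpu instead if you want the GPU-enabled version.
Setting up LangSmith for best-in-class observability is also helpful, but not required:
# import os
# os.environ["LANGCHAIN_TRACING_V2"] = "true"
# os.environ["LANGCHAIN_API_KEY"] = "<your-api-key>"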
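Ingestion
Here we ingest documents into the vector store.
# Example: faiss_search.py
# Uncomment the following two lines if you need to initialize FAISS without AVX2 optimization
# import os
# os.environ["FAISS_NO_AVX2"] = "1"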
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
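# Load the source text and split it into chunks of at most 1500 characters, with no overlap
loader = TextLoader("../../resource/knowledge.txt", encoding="UTF-8")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1500, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
# Embed every chunk and build the FAISS index
embeddings = OpenAIEmbeddings()
db = FAISS.from_documents(docs, embeddings)
# Number of vectors stored in the index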
print(db.index.ntotal)
Output:
16
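To make the chunking step more concrete, here is a minimal sketch of a greedy character-based splitter. It is a simplification written for illustration, not the actual CharacterTextSplitter implementation: the function name split_text, the "\n\n" separator default, and the decision to keep an oversized piece whole are all assumptions, and overlap is not handled.

```python
def split_text(text: str, chunk_size: int = 1500, separator: str = "\n\n") -> list[str]:
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size characters.

    Hypothetical helper for illustration only; overlap is not supported.
    """
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # The piece fits, or it is a single oversized piece that is kept whole.
            current = candidate
        else:
            # The piece does not fit: close the current chunk and start a new one.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc"
sample_chunks = split_text(sample, chunk_size=10)
print(sample_chunks)  # ['aaaa\n\nbbbb', 'cccc']
```

Each chunk produced this way would then be embedded and stored as one vector, which is why db.index.ntotal equals the number of chunks.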
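Querying
Now we can query the vector store. There are several ways to do this; the most common is similarity_search.
# Example: faiss_search.py
# The query means "What does Pixar do?"
query = "Pixar公司是做什么的?"
docs = db.similarity_search(query)
print(docs[0].page_content)
Output (knowledge.txt is bilingual, so each chunk holds the English text followed by its Chinese translation):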
During the next five years, I started a company named NeXT, another company named Pixar, and fell in love with an amazing woman who would become my wife. Pixar went on to create the world's first computer animated feature film, Toy Story, and is now the most successful animation studio in the world. In a remarkable turn of events, Apple bought NeXT, I retuned to Apple, and the technology we developed at NeXT is at the heart of Apple's current renaissance. And Laurene and I have a wonderful family together.
在接下来的五年里, 我创立了一个名叫 NeXT 的公司还有一个叫Pixar的公司然后和一个后来成为我妻子的优雅女人相识。Pixar 制作了世界上第一个用电脑制作的动画电影——“”玩具总动员”Pixar 现在也是世界上最成功的电脑制作工作室。在后来的一系列运转中Apple 收购了NeXT然后我又回到了苹果公司。我们在NeXT 发展的技术在 Apple 的复兴之中发挥了关键的作用。我还和 Laurence 一起建立了一个幸福的家庭。
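As a Retriever
We can also convert the vector store into a Retriever class. This makes it easy to use in other LangChain methods, most of which are built to work with retrievers.
# Example: faiss_retriever.py
retriever = db.as_retriever()
docs = retriever.invoke(query)
print(docs[0].page_content)
Output: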
During the next five years, I started a company named NeXT, another company named Pixar, and fell in love with an amazing woman who would become my wife. Pixar went on to create the world's first computer animated feature film, Toy Story, and is now the most successful animation studio in the world. In a remarkable turn of events, Apple bought NeXT, I retuned to Apple, and the technology we developed at NeXT is at the heart of Apple's current renaissance. And Laurene and I have a wonderful family together.
在接下来的五年里, 我创立了一个名叫 NeXT 的公司还有一个叫Pixar的公司然后和一个后来成为我妻子的优雅女人相识。Pixar 制作了世界上第一个用电脑制作的动画电影——“”玩具总动员”Pixar 现在也是世界上最成功的电脑制作工作室。在后来的一系列运转中Apple 收购了NeXT然后我又回到了苹果公司。我们在NeXT 发展的技术在 Apple 的复兴之中发挥了关键的作用。我还和 Laurence 一起建立了一个幸福的家庭。
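The retriever idea can be illustrated with a small adapter: an object that exposes a single invoke(query) method and delegates to the store's similarity search. The sketch below is a toy under stated assumptions and does not reproduce LangChain's retriever classes: Doc, ToyStore, and ToyRetriever are hypothetical names, and the word-overlap ranking only stands in for real embedding similarity.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a document with text content."""
    page_content: str


class ToyStore:
    """Hypothetical store that ranks documents by word overlap with the query."""

    def __init__(self, texts: list[str]) -> None:
        self.docs = [Doc(t) for t in texts]

    def similarity_search(self, query: str, k: int = 4) -> list[Doc]:
        query_words = set(query.lower().split())

        def overlap(doc: Doc) -> int:
            return len(query_words & set(doc.page_content.lower().split()))

        # Highest overlap first; ties keep their original order.
        return sorted(self.docs, key=overlap, reverse=True)[:k]


class ToyRetriever:
    """Adapter exposing a uniform invoke(query) interface over a store."""

    def __init__(self, store: ToyStore, k: int = 4) -> None:
        self.store = store
        self.k = k

    def invoke(self, query: str) -> list[Doc]:
        return self.store.similarity_search(query, k=self.k)


store = ToyStore([
    "Pixar makes animated films",
    "Apple sells computers",
    "NeXT built workstations",
])
toy_retriever = ToyRetriever(store, k=1)
top = toy_retriever.invoke("what does Pixar do")
print(top[0].page_content)  # Pixar makes animated films
```

Because callers only depend on invoke, the underlying store could be swapped without changing the code that consumes the retriever.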
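Similarity Search with Score
There are also some FAISS-specific methods. One of them is similarity_search_with_score, which returns not only the documents but also the distance score between the query and each of them. The returned score is an L2 distance, so a lower score is better.
# Example: faiss_similarity.py
# Return the documents together with their distance to the query. The score is an L2 distance, so lower is better.
docs_and_scores = db.similarity_search_with_score(query)
print(docs_and_scores)
# similarity_search_by_vector searches for documents similar to a given embedding vector; it takes the vector as its argument instead of a string and returns documents without scores.
embedding_vector = embeddings.embed_query(query)
docs = db.similarity_search_by_vector(embedding_vector)
print(docs)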
Output (from similarity_search_with_score; each entry is a (Document, L2 distance) pair):
[(Document(metadata={'source': '../../resource/knowledge.txt'}, page_content="During the next five years, I started a company named NeXT, another company named Pixar, and fell in love with an amazing woman who would become my wife. Pixar went on to create the world's first computer animated feature film, Toy Story, and is now the most successful animation studio in the world. In a remarkable turn of events, Apple bought NeXT, I retuned to Apple, and the technology we developed at NeXT is at the heart of Apple's current renaissance. And Laurene and I have a wonderful family together.\n在接下来的五年里, 我创立了一个名叫 NeXT 的公司还有一个叫Pixar的公司然后和一个后来成为我妻子的优雅女人相识。Pixar 制作了世界上第一个用电脑制作的动画电影——“”玩具总动员”Pixar 现在也是世界上最成功的电脑制作工作室。在后来的一系列运转中Apple 收购了NeXT然后我又回到了苹果公司。我们在NeXT 发展的技术在 Apple 的复兴之中发挥了关键的作用。我还和 Laurence 一起建立了一个幸福的家庭。"), 0.3155345),
 (Document(metadata={'source': '../../resource/knowledge.txt'}, page_content="I was lucky – I found what I loved to do early in life. Woz and I started Apple in my parents' garage when I was 20. We worked hard, and in 10 years Apple had grown from just the two of us in a garage into a $2 billion company with over 4000 employees. We had just released our finest creation - the Macintosh - a year earlier, and I had just turned 30. And then I got fired. How can you get fired from a company you started? Well, as Apple grew we hired someone who I thought was very talented to run the company with me, and for the first year or so things went well. But then our visions of the future began to diverge and eventually we had a falling out. When we did, our Board of Directors sided with him. So at 30 I was out. And very publicly out. What had been the focus of my entire adult life was gone, and it was devastating.\n我非常幸运因为我在很早的时候就找到了我钟爱的东西。沃兹和我在二十岁的时候就在父母的车库里面开创了苹果公司。我们工作得很努力十年之后这个公司从那两个车库中的穷光蛋发展到了超过四千名的雇员、价值超过二十亿的大公司。在公司成立的第九年我们刚刚发布了最好的产品那就是 Macintosh。我也快要到三十岁了。在那一年我被炒了鱿鱼。你怎么可能被你自己创立的公司炒了鱿鱼呢嗯在苹果快速成长的时候我们雇用了一个很有天分的家伙和我一起管理这个公司在最初的几年公司运转的很好。但是后来我们对未来的看法发生了分歧, 最终我们吵了起来。当争吵不可开交的时候董事会站在了他的那一边。所以在三十岁的时候我被炒了。在这么多人的眼皮下我被炒了。在而立之年我生命的全部支柱离自己远去这真是毁灭性的打击。"), 0.44481623),
 (Document(metadata={'source': '../../resource/knowledge.txt'}, page_content="I really didn't know what to do for a few months. I felt that I had let the previous generation of entrepreneurs down - that I had dropped the baton as it was being passed to me. I met with David Packard and Bob Noyce and tried to apologize for screwing up so badly. I was a very public failure, and I even thought about running away from the valley. But something slowly began to dawn on me – I still loved what I did. The turn of events at Apple had not changed that one bit. I had been rejected, but I was still in love. And so I decided to start over.\n在最初的几个月里我真是不知道该做些什么。我把从前的创业激情给丢了我觉得自己让与我一同创业的人都很沮丧。我和 David Pack 和 Bob Boyce 见面并试图向他们道歉。我把事情弄得糟糕透顶了。但是我渐渐发现了曙光我仍然喜爱我从事的这些东西。苹果公司发生的这些事情丝毫的没有改变这些一点也没有。我被驱逐了但是我仍然钟爱它。所以我决定从头再来。\n\nI didn't see it then, but it turned out that getting fired from Apple was the best thing that could have ever happened to me. The heaviness of being successful was replaced by the lightness of being a beginner again, less sure about everything. It freed me to enter one of the most creative periods of my life.\n我当时没有觉察但是事后证明从苹果公司被炒是我这辈子发生的最棒的事情。因为作为一个成功者的极乐感觉被作为一个创业者的轻松感觉所重新代替对任何事情都不那么特别看重。这让我觉得如此自由进入了我生命中最有创造力的一个阶段。"), 0.46826816),
 (Document(metadata={'source': '../../resource/knowledge.txt'}, page_content="I'm pretty sure none of this would have happened if I hadn't been fired from Apple. It was awful tasting medicine, but I guess the patient needed it. Sometimes life hits you in the head with a brick. Don't lose faith. I'm convinced that the only thing that kept me going was that I loved what I did. You've got to find what you love. And that is as true for your work as it is for your lovers. Your work is going to fill a large part of your life, and the only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do. If you haven't found it yet, keep looking. Don't settle. As with all matters of the heart, you'll know when you find it. And, like any great relationship, it just gets better and better as the years roll on. So keep looking until you find it. Don't settle.\n我可以非常肯定,如果我不被苹果公司开除的话这其中一件事情也不会发生的。这个良药的味道实在是太苦了但是我想病人需要这个药。有些时候生活会拿起一块砖头向你的脑袋上猛拍一下。不要失去信心我很清楚唯一使我一直走下去的就是我做的事情令我无比钟爱。你需要去找到你所爱的东西对于工作是如此对于你的爱人也是如此。你的工作将会占据生活中很大的一部分。你只有相信自己所做的是伟大的工作你才能怡然自得。如果你现在还没有找到那么继续找、不要停下来、全心全意的去找当你找到的时候你就会知道的。就像任何真诚的关系随着岁月的流逝只会越来越紧密。所以继续找直到你找到它不要停下来。\n\nMy third story is about death.\n我的第三个故事是关于死亡的。"), 0.4740282)]
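To show what "lower score is better" means for an L2 distance, here is a minimal brute-force sketch in plain Python. It is not how Faiss is implemented internally; the function names squared_l2 and search_with_score and the two-dimensional example vectors are assumptions made for illustration. The sketch uses the squared distance, which to my understanding is what a flat L2 Faiss index reports and which preserves the same ranking as the plain distance.

```python
def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean (L2) distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_with_score(
    query_vec: list[float], vectors: list[list[float]], k: int = 4
) -> list[tuple[int, float]]:
    """Return (index, distance) pairs for the k nearest vectors, nearest first."""
    scored = [(i, squared_l2(query_vec, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])  # ascending: smaller distance is a better match
    return scored[:k]


# Hypothetical toy "embeddings" for three documents.
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
results = search_with_score([1.0, 1.0], toy_vectors, k=2)
print(results)  # [(1, 1.0), (0, 2.0)]
```

In the output above, the first document has the smallest score (0.3155345) for the same reason: its embedding lies closest to the query embedding.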
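Saving and Loading
You can also save and load a FAISS index. This is useful because you do not have to recreate it every time you use it.
# Example: faiss_save.py
# Save the index to a local folder
db.save_local("faiss_index")
# Load the index. Loading deserializes pickled data, so only enable allow_dangerous_deserialization for files you trust.
new_db = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)
docs = new_db.similarity_search(query)
print(docs[0])
Output: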
page_content='During the next five years, I started a company named NeXT, another company named Pixar, and fell in love with an amazing woman who would become my wife. Pixar went on to create the world's first computer animated feature film, Toy Story, and is now the most successful animation studio in the world. In a remarkable turn of events, Apple bought NeXT, I retuned to Apple, and the technology we developed at NeXT is at the heart of Apple's current renaissance. And Laurene and I have a wonderful family together.
在接下来的五年里, 我创立了一个名叫 NeXT 的公司还有一个叫Pixar的公司然后和一个后来成为我妻子的优雅女人相识。Pixar 制作了世界上第一个用电脑制作的动画电影——“”玩具总动员”Pixar 现在也是世界上最成功的电脑制作工作室。在后来的一系列运转中Apple 收购了NeXT然后我又回到了苹果公司。我们在NeXT 发展的技术在 Apple 的复兴之中发挥了关键的作用。我还和 Laurence 一起建立了一个幸福的家庭。' metadata={'source': '../../resource/knowledge.txt'}
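The save-and-reload round trip can be sketched with the standard library alone. This is a toy under stated assumptions, not the format that save_local writes: save_index, load_index, and the index.json file name are hypothetical. JSON is used here because loading it cannot execute code, which is the risk the allow_dangerous_deserialization flag asks you to acknowledge for pickled data.

```python
import json
import tempfile
from pathlib import Path


def save_index(folder: str, vectors: list[list[float]], texts: list[str]) -> None:
    """Write vectors and their texts to <folder>/index.json (hypothetical layout)."""
    path = Path(folder)
    path.mkdir(parents=True, exist_ok=True)
    payload = {"vectors": vectors, "texts": texts}
    (path / "index.json").write_text(json.dumps(payload), encoding="utf-8")


def load_index(folder: str) -> tuple[list[list[float]], list[str]]:
    """Read back what save_index wrote, so embeddings need not be recomputed."""
    data = json.loads((Path(folder) / "index.json").read_text(encoding="utf-8"))
    return data["vectors"], data["texts"]


# Round trip through a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    save_index(tmp, [[0.1, 0.2]], ["hello"])
    loaded_vectors, loaded_texts = load_index(tmp)

print(loaded_vectors, loaded_texts)  # [[0.1, 0.2]] ['hello']
```

The benefit is the same as with the real index: the expensive embedding step runs once, and later runs only pay for reading the saved data.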