
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}

Pinecone

A managed vector database for production AI applications. Fully managed, auto-scaling, with hybrid search (dense + sparse), metadata filtering, and namespaces. Low latency (< 100ms p95). Use for production RAG, recommendation systems, or search at scale. Best when you want serverless, managed infrastructure.

Technical metadata

Source: optional - install with hermes skills install official/mlops/pinecone
Path: optional-skills/mlops/pinecone
Version: 1.0.0
Author: Orchestra Research
License: MIT
Platforms: linux, macos, windows
Tags: RAG, Pinecone, Vector Database, Managed Service, Serverless, Hybrid Search, Production, Auto-Scaling, Low Latency, Recommendations

Reference: Full SKILL.md

Info

Below is the original SKILL.md definition that Hermes loads when this skill is activated. This reference block keeps the original text so that commands, code, and identifiers are preserved exactly.

# Pinecone - Managed Vector Database

The vector database for production AI applications.

## When to use Pinecone

**Use when:**
- Need managed, serverless vector database
- Production RAG applications
- Auto-scaling required
- Low latency critical (<100ms)
- Don't want to manage infrastructure
- Need hybrid search (dense + sparse vectors)

**Metrics**:
- Fully managed SaaS
- Auto-scales to billions of vectors
- **p95 latency <100ms**
- 99.9% uptime SLA

**Use alternatives instead**:
- **Chroma**: Self-hosted, open-source
- **FAISS**: Offline, pure similarity search
- **Weaviate**: Self-hosted with more features

## Quick start

### Installation

```bash
pip install pinecone  # the package was renamed from pinecone-client
```

### Basic usage

```python
from pinecone import Pinecone, ServerlessSpec

# Initialize
pc = Pinecone(api_key="your-api-key")

# Create index
pc.create_index(
    name="my-index",
    dimension=1536,   # Must match embedding dimension
    metric="cosine",  # or "euclidean", "dotproduct"
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)

# Connect to index
index = pc.Index("my-index")

# Upsert vectors
index.upsert(vectors=[
    {"id": "vec1", "values": [0.1, 0.2, ...], "metadata": {"category": "A"}},
    {"id": "vec2", "values": [0.3, 0.4, ...], "metadata": {"category": "B"}}
])

# Query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    include_metadata=True
)

print(results["matches"])
```

## Core operations

### Create index

```python
# Serverless (recommended)
pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",  # or "gcp", "azure"
        region="us-east-1"
    )
)

# Pod-based (for consistent performance)
from pinecone import PodSpec

pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment="us-east1-gcp",
        pod_type="p1.x1"
    )
)
```
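Index creation is asynchronous, so `create_index` can return before the index is ready to serve traffic. A minimal polling sketch, written generically so the readiness probe is injectable (checking `pc.describe_index(name).status["ready"]` is the usual probe in recent clients, but treat that attribute path as an assumption to verify against your client version):

```python
import time

def wait_until_ready(is_ready, timeout=60.0, interval=1.0):
    """Poll is_ready() until it returns True or the timeout elapses.

    Returns True if the probe succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_ready():
            return True
        time.sleep(interval)
    return False

# With the Pinecone client, the probe would typically be:
# wait_until_ready(lambda: pc.describe_index("my-index").status["ready"])
```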

### Upsert vectors

```python
# Single upsert
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],  # 1536 dimensions
        "metadata": {
            "text": "Document content",
            "category": "tutorial",
            "timestamp": "2025-01-01"
        }
    }
])

# Batch upsert (recommended)
vectors = [
    {"id": f"vec{i}", "values": embedding, "metadata": metadata}
    for i, (embedding, metadata) in enumerate(zip(embeddings, metadatas))
]

index.upsert(vectors=vectors, batch_size=100)
```
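The client's `batch_size` argument handles chunking for you, but if vectors come from a lazy generator or you want per-batch error handling, hand-rolled chunking is a common pattern. A sketch (the helper name and usage are illustrative, not part of the Pinecone API):

```python
from itertools import islice

def chunked(iterable, size=100):
    """Yield successive lists of up to `size` items from any iterable."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

# Hypothetical usage with an index handle and a vector generator:
# for batch in chunked(vector_iter, size=100):
#     index.upsert(vectors=batch)
```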

### Query vectors

```python
# Basic query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=10,
    include_metadata=True,
    include_values=False
)

# With metadata filtering
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    filter={"category": {"$eq": "tutorial"}}
)

# Namespace query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    namespace="production"
)

# Access results
for match in results["matches"]:
    print(f"ID: {match['id']}")
    print(f"Score: {match['score']}")
    print(f"Metadata: {match['metadata']}")
```

### Metadata filtering

```python
# Exact match
filter = {"category": "tutorial"}

# Comparison
filter = {"price": {"$gte": 100}}  # $gt, $gte, $lt, $lte, $ne

# Logical operators
filter = {
    "$and": [
        {"category": "tutorial"},
        {"difficulty": {"$lte": 3}}
    ]
}  # Also: $or

# In operator
filter = {"tags": {"$in": ["python", "ml"]}}
```
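To build intuition for the operator semantics above, here is a local evaluator for these filter shapes. This is a sketch for scalar metadata values only, not the server's implementation (Pinecone's handling of list-valued fields and edge cases may differ):

```python
def matches(metadata, flt):
    """Evaluate a Pinecone-style filter dict against one metadata dict."""
    ops = {
        "$eq": lambda a, b: a == b,
        "$ne": lambda a, b: a != b,
        "$gt": lambda a, b: a > b,
        "$gte": lambda a, b: a >= b,
        "$lt": lambda a, b: a < b,
        "$lte": lambda a, b: a <= b,
        "$in": lambda a, b: a in b,
    }
    for key, cond in flt.items():
        if key == "$and":
            if not all(matches(metadata, c) for c in cond):
                return False
        elif key == "$or":
            if not any(matches(metadata, c) for c in cond):
                return False
        elif isinstance(cond, dict):
            # Operator form, e.g. {"price": {"$gte": 100}}
            for op, val in cond.items():
                if not ops[op](metadata.get(key), val):
                    return False
        else:
            # Bare value is shorthand for $eq
            if metadata.get(key) != cond:
                return False
    return True
```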

## Namespaces

```python
# Partition data by namespace
index.upsert(
    vectors=[{"id": "vec1", "values": [...]}],
    namespace="user-123"
)

# Query specific namespace
results = index.query(
    vector=[...],
    namespace="user-123",
    top_k=5
)

# List namespaces
stats = index.describe_index_stats()
print(stats['namespaces'])
```

## Hybrid search (dense + sparse)

```python
# Upsert with sparse vectors
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],  # Dense vector
        "sparse_values": {
            "indices": [10, 45, 123],  # Token IDs
            "values": [0.5, 0.3, 0.8]  # TF-IDF scores
        },
        "metadata": {"text": "..."}
    }
])

# Hybrid query. Note: query() has no alpha parameter;
# dense/sparse weighting is applied client-side by
# scaling both vectors before querying.
results = index.query(
    vector=[0.1, 0.2, ...],
    sparse_vector={
        "indices": [10, 45],
        "values": [0.5, 0.3]
    },
    top_k=5
)
```
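Since `query()` does not weight dense against sparse for you, the usual approach is a client-side convex combination applied to both query vectors before the call. A sketch (the helper name and exact scaling convention are illustrative, assuming alpha in [0, 1] where 1 means pure dense):

```python
def hybrid_scale(dense, sparse, alpha):
    """Scale a dense vector and a sparse vector for hybrid search.

    alpha = 1.0 -> pure dense, alpha = 0.0 -> pure sparse.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse

# Hypothetical usage:
# dense_q, sparse_q = hybrid_scale(dense_q, sparse_q, alpha=0.5)
# results = index.query(vector=dense_q, sparse_vector=sparse_q, top_k=5)
```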

## LangChain integration

```python
from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings

# Create vector store
vectorstore = PineconeVectorStore.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    index_name="my-index"
)

# Query
results = vectorstore.similarity_search("query", k=5)

# With metadata filter
results = vectorstore.similarity_search(
    "query",
    k=5,
    filter={"category": "tutorial"}
)

# As retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 10})
```

## LlamaIndex integration

```python
from pinecone import Pinecone
from llama_index.vector_stores.pinecone import PineconeVectorStore

# Connect to Pinecone
pc = Pinecone(api_key="your-key")
pinecone_index = pc.Index("my-index")

# Create vector store
vector_store = PineconeVectorStore(pinecone_index=pinecone_index)

# Use in LlamaIndex
from llama_index.core import StorageContext, VectorStoreIndex

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
```

## Index management

```python
# List indices
indexes = pc.list_indexes()

# Describe index
index_info = pc.describe_index("my-index")
print(index_info)

# Get index stats
stats = index.describe_index_stats()
print(f"Total vectors: {stats['total_vector_count']}")
print(f"Namespaces: {stats['namespaces']}")

# Delete index
pc.delete_index("my-index")
```

## Delete vectors

```python
# Delete by ID
index.delete(ids=["vec1", "vec2"])

# Delete by metadata filter (pod-based indexes only;
# not supported on serverless indexes)
index.delete(filter={"category": "old"})

# Delete all in namespace
index.delete(delete_all=True, namespace="test")

# Delete all vectors in the default namespace
# (to delete the index itself, use pc.delete_index)
index.delete(delete_all=True)
```

## Best practices

1. **Use serverless** - Auto-scaling, cost-effective
2. **Batch upserts** - More efficient (100-200 per batch)
3. **Add metadata** - Enable filtering
4. **Use namespaces** - Isolate data by user/tenant
5. **Monitor usage** - Check Pinecone dashboard
6. **Optimize filters** - Index frequently filtered fields
7. **Test with free tier** - 1 index, 100K vectors free
8. **Use hybrid search** - Combines dense semantic and sparse keyword signals for better retrieval quality
9. **Set appropriate dimensions** - Match embedding model
10. **Regular backups** - Export important data

## Performance

| Operation | Latency | Notes |
|-----------|---------|-------|
| Upsert | ~50-100ms | Per batch |
| Query (p50) | ~50ms | Depends on index size |
| Query (p95) | ~100ms | SLA target |
| Metadata filter | ~+10-20ms | Additional overhead |

## Pricing (as of 2025)

**Serverless**:
- $0.096 per million read units
- $0.06 per million write units
- $0.06 per GB storage/month

**Free tier**:
- 1 serverless index
- 100K vectors (1536 dimensions)
- Great for prototyping
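
The storage rate above can be turned into a quick back-of-the-envelope estimate. A rough sketch assuming 4-byte floats per dimension; read/write unit accounting depends on request sizes and is deliberately ignored here, so treat this as an order-of-magnitude check, not a bill:

```python
def monthly_storage_cost(n_vectors, dimension, price_per_gb=0.06,
                         bytes_per_value=4, metadata_bytes=0):
    """Rough monthly storage cost for a serverless index.

    Assumes each vector stores `dimension` values of `bytes_per_value`
    bytes plus optional metadata bytes; ignores indexing overhead.
    """
    total_bytes = n_vectors * (dimension * bytes_per_value + metadata_bytes)
    gigabytes = total_bytes / 1e9
    return gigabytes * price_per_gb

# Example: 1M vectors at 1536 dims ~ 6.1 GB -> about $0.37/month
```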

## Resources

- **Website**: https://www.pinecone.io
- **Docs**: https://docs.pinecone.io
- **Console**: https://app.pinecone.io
- **Pricing**: https://www.pinecone.io/pricing