Python API¶

pluckit's Python API is built around three types: Plucker (the entry point), Selection (a lazy query chain), and Pluckin (the extension point). Everything else is either a mutation class or a convenience wrapper around these.

from pluckit import Plucker, AstViewer

pluck = Plucker(code="src/**/*.py", plugins=[AstViewer])

`Plucker`¶

The entry point. Wraps a DuckDB connection, loads the sitting_duck extension on first use, and exposes methods for finding, viewing, and mutating code.

Constructor¶

Plucker(
    code: str | list[str] | None = None,
    *,
    plugins: list[Pluckin | type[Pluckin]] = (),
    repo: str | None = None,
    db: duckdb.DuckDBPyConnection | None = None,
    cache: bool | str = False,
)

Parameter	Description
`code`	Glob pattern(s) or explicit file list for the source corpus
`plugins`	Pluckin classes or instances to register
`repo`	Repository root for relative paths (default: current working directory)
`db`	An existing DuckDB connection to reuse (default: create a fresh one)
`cache`	Persistent AST cache. `True` → `.pluckit.duckdb` in repo root. `str` → custom path.

Persistent AST caching. When cache=True, pluckit opens a persistent DuckDB file (.pluckit.duckdb by default) and materializes read_ast output into per-pattern tables. Subsequent queries against the same pattern skip re-parsing. File-stat mtime checks drive incremental invalidation, so only modified files are re-parsed. The same option can be set in pyproject.toml with cache = true under [tool.pluckit].
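
The mtime-driven invalidation described above can be pictured with a small standard-library sketch. It is not pluckit's implementation: the helper name stale_files and the in-memory seen_mtimes dict are hypothetical stand-ins for whatever the cache actually stores.

```python
import os
import tempfile
from pathlib import Path


def stale_files(paths, seen_mtimes):
    """Return paths whose mtime is unrecorded or changed (hypothetical helper)."""
    stale = []
    for path in paths:
        mtime = os.stat(path).st_mtime_ns
        if seen_mtimes.get(path) != mtime:
            stale.append(path)
            seen_mtimes[path] = mtime  # remember the stat for the next pass
    return stale


with tempfile.TemporaryDirectory() as tmp:
    a, b = Path(tmp, "a.py"), Path(tmp, "b.py")
    a.write_text("x = 1\n")
    b.write_text("y = 2\n")
    seen = {}
    first_pass = stale_files([a, b], seen)   # nothing recorded yet: both are stale
    second_pass = stale_files([a, b], seen)  # unchanged: nothing to re-parse
    # Simulate an edit to b.py by bumping its mtime by one second.
    st = os.stat(b)
    os.utime(b, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
    third_pass = stale_files([a, b], seen)   # only b.py needs re-parsing
```

Only the files returned by the third pass would need to be parsed again; everything else can be served from the cached tables.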

Methods¶

`find(selector: str) -> Selection`¶

Run a selector against the configured code corpus and return a lazy Selection:

fns = pluck.find(".fn:exported")

`view(selector: str, *, format: str = "markdown") -> View`¶

Render matched nodes as markdown (the AstViewer plugin must be registered). Returns a View object rather than a plain string; see the View and ViewBlock section below.

print(pluck.view(".fn#main { show: signature; }"))

`source(glob: str) -> Source`¶

Create a Source handle for ad-hoc queries against a different glob without creating a whole new Plucker.

`fts_collection(name: str) -> FtsCollection`¶

Get a handle to a named FTS collection. Requires fledgling. Returns an FtsCollection with .create(query) and .search(query) methods:

col = pluck.fts_collection("tools")
col.create("""
    SELECT 'search' AS id,
           'full-text BM25 search over code and docs' AS text,
           map{'kit': 'fledgling'} AS metadata
""")
results = col.search("search")
for doc_id, text, metadata, score in results:  # doc_id avoids shadowing the builtin id()
    print(f"{score:.2f}  {doc_id}: {text}")

Each collection gets its own BM25 index with independent IDF statistics. The fixed schema is (id TEXT, text TEXT, metadata MAP(TEXT, TEXT)). Creating a collection is idempotent — it replaces the existing table and index if the collection already exists.
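
To see why independent IDF statistics matter, here is a minimal sketch using one common BM25 IDF variant. The formula and the toy documents are assumptions chosen for illustration; the exact formula used by the underlying index is not stated here.

```python
import math


def idf(term, docs):
    """IDF of a term within one collection (a common BM25 variant, assumed here)."""
    containing = sum(1 for doc in docs if term in doc.split())
    return math.log((len(docs) - containing + 0.5) / (containing + 0.5) + 1.0)


# Two hypothetical collections with different term distributions.
tools = ["search code", "search docs", "format code"]
notes = ["release notes", "search tips", "style guide", "build steps"]

idf_tools = idf("search", tools)  # "search" is common here, so it weighs less
idf_notes = idf("search", notes)  # "search" is rare here, so it weighs more
```

Because each collection computes its own statistics, the same query term can carry a different weight depending on which collection is searched.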

`pluckins` (property) `-> list[Pluckin]`¶

The loaded pluckin instances for this Plucker, in load order. Public since 0.13.0 (replaces reaching into the private _registry). Use it to introspect which plugins are active or to pull tool definitions a pluckin contributes:

for p in pluck.pluckins:
    print(p.name)
    # e.g. a pluckin may expose squackit_tools for MCP integration

`Selector`¶

A validated, serializable CSS-over-AST selector string. Subclasses str so it's backward-compatible everywhere a bare selector string is used today — all existing code like pluck.find(".fn:exported") keeps working.

from pluckit import Selector

s = Selector(".fn:exported")
assert isinstance(s, str)
assert s.is_valid

# Invalid selector resolving to nothing
Selector(".nonexistent_taxonomy_class").validate()  # raises PluckerError

Supports the standard serialization protocol:

Method	Purpose
`.to_dict()` / `.from_dict(d)`	`{"selector": "..."}` dict form
`.to_json()` / `.from_json(s)`	JSON string
`.to_argv()` / `.from_argv(tokens)`	CLI token list
`.validate()` / `.is_valid`	Compile-time check
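
The "subclass of str plus a serialization protocol" idea can be sketched in a few lines. SketchSelector is a hypothetical stand-in, not pluckit's Selector; it covers only the dict and JSON forms and performs no validation.

```python
import json


class SketchSelector(str):
    """Hypothetical str subclass that round-trips through dict and JSON."""

    def to_dict(self):
        return {"selector": str(self)}

    @classmethod
    def from_dict(cls, data):
        return cls(data["selector"])

    def to_json(self):
        return json.dumps(self.to_dict())

    @classmethod
    def from_json(cls, text):
        return cls.from_dict(json.loads(text))


s = SketchSelector(".fn:exported")
round_tripped = SketchSelector.from_json(s.to_json())
```

Since the class is still a str, it can be passed anywhere a bare selector string is accepted, while the extra methods make it easy to ship across a process boundary.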

`Selection`¶

A lazy DuckDB relation. Every method on Selection returns another Selection — nothing materializes until you call a terminal method.
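
The lazy-until-terminal behavior can be illustrated without DuckDB. The LazyQuery class below is a hypothetical toy, not the Selection implementation: composing methods only record work, and the terminal count() is where rows are actually examined.

```python
class LazyQuery:
    """Toy lazy chain: filter() records predicates, count() evaluates them."""

    def __init__(self, rows, predicates=()):
        self._rows = rows
        self._predicates = tuple(predicates)

    def filter(self, predicate):
        # Returns a new query; no rows are examined here.
        return LazyQuery(self._rows, self._predicates + (predicate,))

    def count(self):
        # Terminal method: the recorded predicates finally run.
        return sum(1 for row in self._rows if all(p(row) for p in self._predicates))


calls = []


def is_test(name):
    calls.append(name)  # record every evaluation so laziness is observable
    return name.startswith("test_")


query = LazyQuery(["test_a", "helper", "test_b"]).filter(is_test)
calls_before = len(calls)  # nothing has been evaluated yet
matched = query.count()    # evaluates the predicate over every row
```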

Query composition¶

# Refine a selection
tests = pluck.find(".fn[name^=test_]")
without_try = tests.filter(".fn:not(:has(.try))")

# Navigate
classes = pluck.find(".cls")
methods = classes.descendants(".fn")

Method	Description
`find(sel)`	Refine the selection with another selector
`filter(sel)`	Alias for `find`, provided for readability
`descendants(sel)`	Matches anywhere under the selection
`children(sel)`	Direct children only
`ancestors(sel)`	Walk up the AST
`siblings(sel)`	Nodes sharing a parent
`first()`, `last()`, `nth(n)`	Positional selection
`limit(n)`, `offset(n)`	Slice the result set

Terminal methods¶

These materialize the relation and return Python data:

Method	Returns	Description
`count()`	`int`	Number of matched nodes
`names()`	`list[str]`	Identifier names (deduplicated)
`files()`	`list[str]`	Distinct source files containing matches
`rows()`	`list[Node]`	Full AST rows with all sitting_duck cols
`read()`	`list[str]`	Raw source text of each matched node
`to_df()`	`pd.DataFrame`	Pandas DataFrame (requires pandas)

Mutation methods¶

Every mutation method returns a refreshed Selection (so you can chain further queries, though most callers don't). Mutations are transactional per call: each mutation call is atomic, and chained (fluent) mutations run as separate, independent transactions.

Method	Description
`replaceWith(text)`	Replace entire matched node
`replaceWith(old, new)`	String-level replace within matched node
`prepend(text)`	Prepend lines to the matched body
`append(text)`	Append lines to the matched body
`insertBefore(anchor, text)`	Insert lines before an anchor selector
`insertAfter(anchor, text)`	Insert lines after an anchor selector
`wrap(before, after)`	Wrap with surrounding text
`unwrap()`	Inverse of wrap
`addParam(param)`	Add a parameter to every matched function
`removeParam(name)`	Remove a parameter by name
`addArg(expr)`	Add an argument to every matched call
`removeArg(name)`	Remove a keyword argument by name
`rename(new_name)`	Rename the first name occurrence
`clearBody()`	Replace body with `pass` / `{}`
`remove()`	Delete the matched node
`patch(content)`	Apply a unified diff or raw replacement

Example:

pluck.find(".fn#validate_token").replaceWith(
    "return None",
    "raise ValueError('token required')",
)
pluck.find(".fn:exported").addParam("timeout: int = 30")

from pathlib import Path

# Apply a unified diff (read_text() closes the file for us)
diff_content = Path("refactor.patch").read_text()
pluck.find(".fn#handler").patch(diff_content)

# Apply raw replacement text (like replaceWith, but from external content)
new_code = Path("patches/new_handler.py").read_text()
pluck.find(".fn#handler").patch(new_code)

patch(content) auto-detects unified diffs (by leading --- or diff --git) vs raw replacement text. For diffs, context lines must match exactly or a PluckerError is raised.
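
The detection rule can be approximated as a prefix check. This is a sketch of the stated rule only; the function name looks_like_unified_diff is hypothetical, and the real detector may tolerate leading whitespace or other diff headers.

```python
def looks_like_unified_diff(content: str) -> bool:
    """Treat content as a diff when it starts with '--- ' or 'diff --git '."""
    return content.startswith(("--- ", "diff --git "))


diff_text = "--- a/app.py\n+++ b/app.py\n@@ -1 +1 @@\n-x = 1\n+x = 2\n"
git_diff_text = "diff --git a/app.py b/app.py\n" + diff_text
raw_text = "def handler():\n    return 1\n"

kinds = [looks_like_unified_diff(t) for t in (diff_text, git_diff_text, raw_text)]
```

Anything that does not match is treated as raw replacement text, which is why a replacement file that happens to begin with "--- " would be misread under this rule.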

Reading matched source¶

for node in pluck.find(".fn#validate").rows():
    print(f"{node.file_path}:{node.start_line}")
    print(node.source_text)

Selection.rows() returns Node dataclasses with all of sitting_duck's columns — node_id, type, semantic_type, name, start_line, end_line, parent_id, flags, and the native extraction columns (signature_type, parameters, modifiers, annotations).

Module-level shortcuts¶

For one-shot queries that don't need a persistent Plucker:

from pluckit import view, find

print(view(".fn#main { show: outline; }", code="src/**/*.py"))

for path, line, name in find(".fn:exported", code="src/**/*.py"):
    print(f"{path}:{line}:{name}")

These create an ephemeral Plucker, run the query, and tear it down.

`View` and `ViewBlock`¶

Plucker.view() and the module-level pluckit.view() return a View object — not a plain string. A View behaves like a string for the common "print the rendered markdown" case, but also exposes structured metadata about the blocks it contains.

from pluckit import Plucker, AstViewer, View, ViewBlock

pluck = Plucker(code="src/**/*.py", plugins=[AstViewer])
result: View = pluck.view(".fn:exported { show: signature; }")

# Rendered output — backward compatible with the v0.1 bare-string return
print(result)                    # prints the markdown
print(str(result))               # same thing
print(result.markdown)           # explicit accessor
assert "def authenticate" in result   # __contains__ checks the markdown

# Structured access
print(result.files)              # ['src/auth.py', 'src/users.py', ...]
print(len(result))               # number of blocks
for block in result:             # iterate as ViewBlock
    print(block.name, block.start_line, block.show)

# JSON export
import json
print(json.dumps(result.to_dict(), indent=2))
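
For readers curious how an object can act like a string and a container at once, the following sketch uses ordinary Python protocol methods. SketchView and SketchBlock are hypothetical stand-ins, and joining blocks with a blank line is an assumption about rendering, not documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchBlock:
    name: str
    markdown: str


class SketchView:
    """String-like wrapper over a list of blocks (illustrative only)."""

    def __init__(self, blocks):
        self._blocks = list(blocks)

    @property
    def markdown(self):
        return "\n\n".join(block.markdown for block in self._blocks)

    def __str__(self):
        return self.markdown

    def __len__(self):          # also makes an empty view falsy
        return len(self._blocks)

    def __iter__(self):
        return iter(self._blocks)

    def __getitem__(self, index):
        return self._blocks[index]

    def __contains__(self, text):
        return text in self.markdown


view = SketchView([SketchBlock("a", "def a(): ..."), SketchBlock("b", "def b(): ...")])
empty = SketchView([])
```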

`View` methods and properties¶

Member	Type	Description
`markdown`	`str`	Full rendered output
`blocks`	`list[ViewBlock]`	Fresh list of contained blocks
`files`	`list[str]`	Distinct file paths, in first-seen order
`query`	`str`	The query string that produced this view
`format`	`str`	Output format (`markdown` in v0.1)
`to_dict()`	`dict`	JSON-serializable representation
`str(v)` / `print`	`str`	Same as `.markdown`
`len(v)`	`int`	Number of blocks
`bool(v)`	`bool`	`False` for empty views
`for b in v`	`Iterator[ViewBlock]`	Iterate blocks in render order
`v[i]` / `v[a:b]`	`ViewBlock` / `list`	Indexing and slicing
`"s" in v`	`bool`	Substring check against `.markdown`

`ViewBlock` fields¶

Each ViewBlock is a frozen dataclass with:

Field	Type	Description
`markdown`	`str`	Rendered content for this block
`rule`	`Rule`	The query rule that produced it
`show`	`str`	Resolved show mode (`body`, `signature`, …)
`file_path`	`str | None`	Source file; `None` for aggregates
`start_line`	`int | None`	Start line; `None` for aggregates
`end_line`	`int | None`	End line; `None` for aggregates
`name`	`str | None`	Identifier name, if any
`node_type`	`str | None`	AST node type (`function_definition`, …)
`language`	`str | None`	Source language
`is_aggregate`	`bool`	`True` for multi-match signature tables and such

Aggregate blocks. When a rule like .fn { show: signature; } matches many nodes, the viewer auto-collapses the output into a single markdown table. That collapse produces a single ViewBlock with is_aggregate = True and file_path, start_line, end_line all None. Use block.is_aggregate (or block.file_path is None) to distinguish per-node blocks from aggregates.

`Isolated`¶

A scope-aware extraction of a code block with its dependencies. Returned by Selection.isolate(). Identifies which identifiers the block reads from outside its own scope, classifies each as imported / parameter / builtin, and renders the result as a standalone function or a Jupyter cell.

from pluckit import Plucker

pluck = Plucker(code="src/**/*.py")
iso = pluck.find(".fn#outer").isolate()

iso.params         # ['helper']           — free variables → function params
iso.imports        # ['import json']      — import statements to prepend
iso.builtins_used  # ['len']              — builtins used (informational)
iso.body           # original block text

print(iso.as_function("extracted"))   # imports + def extracted(helper): ...
print(iso.as_jupyter_cell())          # imports + "# Required in scope: helper" + body

Fields¶

Field	Type	Description
`body`	`str`	Source text of the extracted block
`file_path`	`str`	Original source file
`start_line`	`int`	Start line (1-indexed)
`end_line`	`int`	End line (1-indexed, inclusive)
`language`	`str`	Source language (e.g., `"python"`)
`params`	`list[str]`	Free-variable names → function parameters
`imports`	`list[str]`	Import statements the block depends on
`builtins_used`	`list[str]`	Python builtins the block uses

Renderers¶

as_function(name="extracted") — standalone function: imports + def name(params) + body
as_jupyter_cell() — imports + # Required in scope: ... comment + inline body (no function wrap)

Serialization¶

to_dict / from_dict / to_json / from_json for MCP transport.

Limitations (v1)¶

Handles only the first match. For multi-match selections, call isolate() once per node, narrowing each call to a single match with .limit(1)
Detects module-level imports but not conditional / relative imports in edge cases
Assumes Python semantics for builtins (dir(builtins) + self/cls)
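
The classification into parameters, imports, and builtins can be approximated with the standard ast module. This is a deliberately simplified sketch, not the Isolated implementation: classify_free_names is a hypothetical helper, module_imports is supplied by the caller, and the scoping rules ignore subtleties such as a name being read before it is assigned.

```python
import ast
import builtins


def classify_free_names(block_source, module_imports):
    """Split names read but never bound in the block into three groups.

    module_imports maps a bound name to its import statement (assumed input).
    """
    bound, loaded = set(), set()
    for node in ast.walk(ast.parse(block_source)):
        if isinstance(node, ast.Name):
            (bound if isinstance(node.ctx, ast.Store) else loaded).add(node.id)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
    free = sorted(loaded - bound)
    builtin_names = set(dir(builtins))
    imports = [module_imports[n] for n in free if n in module_imports]
    builtins_used = [n for n in free if n not in module_imports and n in builtin_names]
    params = [n for n in free if n not in module_imports and n not in builtin_names]
    return params, imports, builtins_used


block = "data = json.loads(raw)\nprint(len(helper(data)))\n"
params, imports, builtins_used = classify_free_names(block, {"json": "import json"})
```

Here data is bound inside the block, so it is not free; helper and raw have no import or builtin behind them and therefore become parameters.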

Chain¶

The Chain class is the programmatic equivalent of the CLI's chain syntax. It represents a source, a list of steps, and optional plugin configuration. Chains can be built from Python dicts, JSON strings, or parsed directly from sys.argv-style token lists.

`ChainStep`¶

A single operation in a chain:

from pluckit.chain import ChainStep

step = ChainStep(op="find", args=[".fn:exported"])
step = ChainStep(op="filter", kwargs={"min_lines": 10})
step = ChainStep(op="count")

Field	Type	Description
`op`	`str`	Operation name (e.g. `find`, `count`)
`args`	`list[str]`	Positional arguments (default: `[]`)
`kwargs`	`dict`	Keyword arguments (default: `{}`)

`Chain`¶

from pluckit.chain import Chain, ChainStep

chain = Chain(
    source=["src/**/*.py"],
    steps=[
        ChainStep(op="find", args=[".fn:exported"]),
        ChainStep(op="count"),
    ],
    plugins=["AstViewer"],
)

Field	Type	Description
`source`	`list[str]`	File paths or glob patterns
`steps`	`list[ChainStep]`	Ordered list of operations
`plugins`	`list[str]`	Plugin names to load (default: `[]`)
`dry_run`	`bool`	Preview changes without writing (default: `False`)
`diff`	`bool`	Output mutations as unified diff (default: `False`)

`Chain.MUTATION_OPS` (class attribute) `-> frozenset[str]`¶

The public, stable set of operation names that mutate source (public since 0.13.0; was _MUTATION_OPS). A chain containing any of these is a mutating chain — callers that gate writes (e.g. squackit blocks mutations unless allow_mutations=True) check membership rather than hard-coding the list:

from pluckit.chain import Chain

is_mutation = any(step.op in Chain.MUTATION_OPS for step in chain.steps)

The set: wrap, unwrap, append, prepend, insertBefore, insertAfter, replaceWith, remove, rename, patch, addArg, removeArg, addParam, removeParam.

Construction methods¶

`Chain.from_dict(data: dict) -> Chain`¶

Build a chain from a dictionary (the same structure as the JSON I/O format described in the CLI reference):

chain = Chain.from_dict({
    "source": ["src/**/*.py"],
    "steps": [
        {"op": "find", "args": [".fn:exported"]},
        {"op": "count"},
    ],
})

`Chain.from_json(json_string: str) -> Chain`¶

Parse a JSON string into a chain:

chain = Chain.from_json('{"source": ["src/**/*.py"], "steps": [{"op": "find", "args": [".fn:exported"]}, {"op": "count"}]}')

`Chain.from_argv(tokens: list[str]) -> Chain`¶

Parse a CLI-style token list into a chain. This is the same parsing the CLI entry point uses:

chain = Chain.from_argv(["src/**/*.py", "find", ".fn:exported", "count"])

Execution¶

`chain.evaluate() -> Any`¶

Run the chain and return the result. The return type depends on the terminal operation: int for count, list[str] for names, and so on.

chain = Chain.from_argv(["src/**/*.py", "find", ".fn:exported", "count"])
result = chain.evaluate()
print(result)  # e.g. 42

Serialization¶

`chain.to_dict() -> dict`¶

Convert the chain to a JSON-serializable dictionary:

data = chain.to_dict()
# {"source": ["src/**/*.py"], "plugins": [], "steps": [{"op": "find", "args": [".fn:exported"]}, {"op": "count"}]}

`chain.to_json() -> str`¶

Serialize the chain as a JSON string:

json_str = chain.to_json()

Pagination¶

Chains support limit, offset, and page as ordinary chain ops. When any of them appear in a chain, evaluate() attaches pagination metadata to the result:

chain = Chain(
    source=["src/**/*.py"],
    steps=[
        ChainStep(op="find", args=[".fn"]),
        ChainStep(op="page", args=["0", "20"]),  # page 0, size 20
        ChainStep(op="names"),
    ],
)
result = chain.evaluate()

result["page"]
# {
#   "offset": 0,
#   "limit": 20,
#   "total": None,        # lazy — call with_total() to fill in
#   "has_more": True,     # heuristic — True if data length >= limit
# }
result["source_chain"]    # the chain with pagination ops stripped — for "give me more"

`has_more` heuristic¶

data_length < limit → definitively False (got fewer than asked — no more)
data_length >= limit → conservatively True (might be the last page, but we can't know without total)
limit is None → has_more is None (unknown)
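
The three rules above translate directly into a small function. The name has_more_hint is hypothetical; the logic simply restates the heuristic.

```python
from typing import Optional


def has_more_hint(data_length: int, limit: Optional[int]) -> Optional[bool]:
    """Restate the heuristic: unknown without a limit, otherwise compare lengths."""
    if limit is None:
        return None               # no limit, so nothing can be inferred
    return data_length >= limit   # a full page might have more; a short page cannot


short_page = has_more_hint(7, 20)
full_page = has_more_hint(20, 20)
unlimited = has_more_hint(500, None)
```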

`Chain.with_total(result)` — compute the exact total on demand¶

Chain.with_total(result)  # mutates result in place, returns it
result["page"]["total"]   # now an int
result["page"]["has_more"] # now exact

Runs one extra SQL query against the source_chain. No-op if the result has no pagination metadata.

Navigation helpers¶

Each helper returns a new Chain ready to evaluate, or None when navigation isn't possible (no more pages / already at offset 0 / result wasn't paginated).

Method	Returns
`Chain.next_page(result)`	Chain for the next page (or None)
`Chain.prev_page(result)`	Chain for the previous page (or None)
`Chain.goto_page(result, n)`	Chain for page n (0-indexed)

result = chain.evaluate()
if next_chain := Chain.next_page(result):
    next_result = next_chain.evaluate()
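
The offset arithmetic behind next and previous navigation might look like the sketch below. It works on the page metadata dict alone and makes two assumptions that the text does not confirm: the next page starts at offset + limit, and the previous offset is clamped at 0. The function names are hypothetical.

```python
def next_offset(page):
    """Offset of the following page, or None when navigation isn't possible."""
    if not page or page.get("limit") is None or not page.get("has_more"):
        return None
    return page["offset"] + page["limit"]


def prev_offset(page):
    """Offset of the preceding page, or None when already at the start."""
    if not page or page.get("limit") is None or page["offset"] == 0:
        return None
    return max(0, page["offset"] - page["limit"])  # clamp so it never goes negative


first = {"offset": 0, "limit": 20, "total": None, "has_more": True}
last = {"offset": 10, "limit": 20, "total": None, "has_more": False}
```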

Edge cases¶

page N SIZE + subsequent limit/offset — page sets both offset and limit; a later limit or offset overrides the corresponding value. Well-defined but confusing — use one pattern or the other, not both.
limit before a mutation — find .fn limit 5 rename bar renames only the first 5 functions. The Selection contains 5 rows at mutation time, so the mutation applies to those 5. Correct but may surprise callers who expected limit to apply only to terminal output.
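
The override rule in the first edge case can be made concrete with a sketch that folds pagination ops into a final (offset, limit) pair. The helper resolve_window is hypothetical, and it assumes that page N SIZE means offset = N * SIZE, which matches the "page 0, size 20" example but is not spelled out for other page numbers.

```python
def resolve_window(steps):
    """Fold page/limit/offset ops in order; later ops override earlier values."""
    offset, limit = 0, None
    for op, args in steps:
        if op == "page":
            number, size = int(args[0]), int(args[1])
            offset, limit = number * size, size  # assumed meaning of page N SIZE
        elif op == "limit":
            limit = int(args[0])
        elif op == "offset":
            offset = int(args[0])
    return offset, limit


page_then_limit = resolve_window([("page", ["2", "20"]), ("limit", ["5"])])
page_then_offset = resolve_window([("page", ["2", "20"]), ("offset", ["3"])])
```

In the first case the window keeps the page's offset but shrinks to five rows, which is the "well-defined but confusing" outcome the text warns about.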

Round-trip example¶

from pluckit.chain import Chain

# Build from CLI tokens
chain = Chain.from_argv(["src/**/*.py", "find", ".fn:exported", "names"])

# Inspect as JSON
print(chain.to_json())

# Reconstruct from the dict form
chain2 = Chain.from_dict(chain.to_dict())

# Execute
result = chain2.evaluate()
for name in result:
    print(name)

Plugins¶

pluckit is composable. Core capabilities live on Selection; anything that depends on extra infrastructure moves into a plugin.

from pluckit import Plucker, AstViewer, Calls, History, Scope

pluck = Plucker(
    code="src/**/*.py",
    plugins=[
        AstViewer,   # viewer with { show: ... } declarations
        Calls,       # call graph (callers / callees / references)
        History,     # git history via duck_tails
        Scope,       # scope-aware queries (defs / refs / enclosing scope)
    ],
)

Writing a plugin¶

A plugin is a subclass of pluckit.pluckins.Pluckin:

from pluckit.pluckins import Pluckin

class WordCount(Pluckin):
    name = "wordcount"

    methods = {
        "word_count": lambda self: sum(
            len(text.split()) for text in self.read()
        ),
    }

    pseudo_classes = {
        ":long": "end_line - start_line > 50",
    }

Class attribute	Purpose
`name`	Unique plugin identifier
`methods`	Dict of method name → function to install on `Selection`
`pseudo_classes`	Dict of `:name` → SQL WHERE fragment
`upgrades`	Dict of method name → function to override an existing method
`setup(ctx)`	Optional hook called when the plugin is registered

Plugins can also register new semantic-type aliases by updating pluckit.selectors.ALIASES, but that's considered advanced — most plugins only need methods and pseudo_classes.
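
One way to picture how a methods dict ends up as callable methods is plain attribute assignment on a class. This sketch is not how pluckit registers plugins: SketchSelection and install are hypothetical, and only the methods attribute is handled.

```python
class SketchSelection:
    """Minimal stand-in exposing the read() method the plugin relies on."""

    def __init__(self, texts):
        self._texts = list(texts)

    def read(self):
        return list(self._texts)


class WordCount:
    name = "wordcount"
    methods = {
        "word_count": lambda self: sum(len(text.split()) for text in self.read()),
    }


def install(plugin_cls, target_cls):
    # Functions assigned on a class become bound methods on its instances.
    for method_name, fn in plugin_cls.methods.items():
        setattr(target_cls, method_name, fn)


install(WordCount, SketchSelection)
total = SketchSelection(["def a(): pass", "x = 1"]).word_count()
```

Each text splits into three whitespace-separated tokens, so the installed method reports six words in total.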

`History` — git history on AST selections¶

from pluckit import Plucker, History

pluck = Plucker(code="src/**/*.py", plugins=[History])
fn = pluck.find(".fn#validate_token")

# Every commit that touched the function's file, most-recent-first
for commit in fn.history():
    print(f"{commit.hash[:8]} {commit.author_name}: {commit.message}")

# Distinct authors (email) for those commits
print(fn.authors())

# The function's body as it was at an old revision — AST-aware, so
# it matches by (name, type), not by today's line range.
print(fn.at("v0.1.0")[0])

# Unified diff between HEAD and the old revision, per matched node.
print(fn.diff("v0.1.0")[0])

Method	Returns	Notes
`history()`	`list[Commit]`	Deduplicated, sorted by date descending
`authors()`	`list[str]` (emails)	Sorted
`at(rev)`	`list[str]`	One entry per matched node; `""` if not found
`diff(rev)`	`list[str]`	Unified diff per matched node
`blame()`	(raises)	Deferred — upstream-blocked on `duck_tails`

Dependencies. History requires the duck_tails DuckDB community extension (for git_read) and the git binary on PATH (for git log --follow). pluckit auto-installs duck_tails on first use; run pluckit init to provision eagerly.

Rename handling. history() uses git log --follow, so commits that touched a file under a previous name are included. at(rev) / diff(rev) locate the node at the historical revision by name+type, so a pure rename is tracked as long as the node's name survives. Structural refactors (a method being pulled out of a class, a function being split) are not automatically tracked.
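
Matching by name and type rather than by line range can be sketched with ast and difflib. This is an illustration of the idea only: function_source is a hypothetical helper, it handles Python function definitions alone, and the two source strings stand in for file contents at two revisions.

```python
import ast
import difflib


def function_source(source, name):
    """Return the text of the function called name, or "" if it is absent."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == name:
            return ast.get_source_segment(source, node) or ""
    return ""


old = "import os\n\ndef validate(token):\n    return None\n"
new = "import os\nimport sys\n\n\ndef validate(token):\n    raise ValueError('token required')\n"

old_body = function_source(old, "validate")
new_body = function_source(new, "validate")
diff = "\n".join(
    difflib.unified_diff(old_body.splitlines(), new_body.splitlines(), "old", "new", lineterm="")
)
missing = function_source(old, "renamed_elsewhere")
```

The function moved down by two lines between the revisions, yet the lookup still finds it because it searches by name and node type. A function that was renamed or split would return an empty string, matching the limitation described above.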

`Calls` — call-graph operations on selections¶

from pluckit import Plucker, Calls

pluck = Plucker(code="src/**/*.py", plugins=[Calls])

# Who calls validate_token?
callers = pluck.find(".fn#validate_token").callers()
print(callers.names())

# What does authenticate call?
callees = pluck.find(".fn#authenticate").callees()

# All references to a name (call sites + bare uses)
refs = pluck.find(".fn#config").references()

Method	Returns	Description
`callers()`	`Selection`	Functions that call matched nodes
`callees()`	`Selection`	Functions called by matched nodes
`references()`	`Selection`	All references to matched nodes

Dependencies. Calls wraps sitting_duck's ::callers / ::callees / ::references pseudo-elements. No extra extensions needed.

`Scope` — scope-aware queries¶

from pluckit import Plucker, Scope

pluck = Plucker(code="src/**/*.py", plugins=[Scope])

# Enclosing scope chain (module → class → function)
scope_chain = pluck.find(".fn#inner").scope()

# Names DEFINED in the scope containing each match
defs = pluck.find(".fn#outer").defs()

# Name REFERENCES within the scope containing each match
refs = pluck.find(".fn#outer").refs()

Method	Returns	Description
`scope()`	`Selection`	Enclosing scope hierarchy for each match
`defs()`	`Selection`	Definitions in the scope containing each match
`refs()`	`Selection`	References in the scope containing each match

Dependencies. Uses sitting_duck's ::scope pseudo-element and the scope_id / scope_stack columns on read_ast.

Error handling¶

Every recoverable error raises PluckerError:

from pluckit import Plucker, PluckerError

try:
    pluck = Plucker(code="src/**/*.py")
    pluck.find(".fn").replaceWith("def broken(:::")
except PluckerError as e:
    print(f"Mutation failed: {e}")
    # All affected files have already been rolled back to their
    # pre-mutation state.

PluckerError is raised for:

Failed extension installation (pluckit init will reproduce this)
Selector compilation errors
Mutation syntax errors (with automatic rollback)
Invalid paths, missing files, parse failures
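
The rollback guarantee for mutation syntax errors can be pictured as snapshot, validate, and restore. The sketch below is an assumption about the general pattern, not pluckit's code: mutate_with_rollback is hypothetical, it raises SyntaxError rather than PluckerError, and it uses Python's compile() as the syntax check.

```python
import tempfile
from pathlib import Path


def mutate_with_rollback(paths, transform):
    """Apply transform to each file; restore every file if any result is invalid."""
    snapshots = {path: path.read_text() for path in paths}
    try:
        for path in paths:
            new_text = transform(snapshots[path])
            compile(new_text, str(path), "exec")  # syntax check before writing
            path.write_text(new_text)
    except SyntaxError:
        for path, text in snapshots.items():
            path.write_text(text)  # put back the pre-mutation contents
        raise


with tempfile.TemporaryDirectory() as tmp:
    first, second = Path(tmp, "first.py"), Path(tmp, "second.py")
    first.write_text("a = 1\n")
    second.write_text("b = 2  # second\n")

    def transform(text):
        # Produces valid code for first.py and broken code for second.py.
        return text + ("def broken(:::\n" if "second" in text else "# ok\n")

    try:
        mutate_with_rollback([first, second], transform)
        rolled_back = False
    except SyntaxError:
        rolled_back = True
    first_after = first.read_text()
    second_after = second.read_text()
```

Even though first.py was written successfully before the failure, it is restored along with second.py, so the caller never observes a half-applied mutation.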
