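Introduction
deepdiff is a powerful Python library for deep comparison of arbitrary Python objects: dictionaries, lists, sets, strings, custom class instances, numpy arrays, and more. Rather than the single True/False that == returns, it produces a structured difference report that states precisely what changed and how.
Beyond diffing, deepdiff ships utility modules for searching, hashing, and patching, which suit configuration change detection, API test assertions, and data version comparison.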
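To make "structured difference report" concrete, here is a minimal standard-library sketch of the idea. It is not deepdiff's implementation: `simple_diff` is a hypothetical helper that only handles dicts and equal-length lists, and it treats everything else as a leaf compared with ==.

```python
_MISSING = object()  # marks a key that exists on only one side


def simple_diff(old, new, path="root"):
    """Return {path: (old_value, new_value)} for every leaf that differs."""
    if isinstance(old, dict) and isinstance(new, dict):
        changes = {}
        for key in old.keys() | new.keys():
            child = f"{path}[{key!r}]"
            changes.update(
                simple_diff(old.get(key, _MISSING), new.get(key, _MISSING), child)
            )
        return changes
    if isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        changes = {}
        for index, (a, b) in enumerate(zip(old, new)):
            changes.update(simple_diff(a, b, f"{path}[{index}]"))
        return changes
    # Leaf values (and lists of different length) are compared as a whole.
    return {} if old == new else {path: (old, new)}


t1 = {"name": "Alice", "age": 30, "skills": ["Python", "SQL"]}
t2 = {"name": "Bob", "age": 30, "skills": ["Python", "Java"]}
changes = simple_diff(t1, t2)
# changes == {"root['name']": ('Alice', 'Bob'), "root['skills'][1]": ('SQL', 'Java')}
```

Where `t1 == t2` only answers False, the report pinpoints each changed location. deepdiff adds much more on top of this idea, such as added and removed items, type changes, and order-insensitive matching.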
Current version: 9.0.0 (2026-03-30)
Python requirement: >= 3.10
License: MIT
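1. Installation
# Basic installation
pip install deepdiff
# With command-line support
pip install "deepdiff[cli]"
# Performance-optimized (pulls in accelerators such as orjson)
pip install "deepdiff[optimize]"
# Install all optional dependencies
pip install "deepdiff[all]"
2. Module overview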
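deepdiff provides the following core modules:
DeepDiff: difference comparison
DeepSearch (grep): deep search
DeepHash: content hashing
Delta: difference patches
extract: path extraction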
3. DeepDiff: difference comparison
3.1 Basic usage
from deepdiff import DeepDiff
t1 = {"name": "Alice", "age": 30, "skills": ["Python", "SQL"]}
t2 = {"name": "Bob", "age": 30, "skills": ["Python", "Java"]}
diff = DeepDiff(t1, t2)
print(diff)
# {
#     'values_changed': {
#         "root['name']": {'new_value': 'Bob', 'old_value': 'Alice'},
#         "root['skills'][1]": {'new_value': 'Java', 'old_value': 'SQL'}
#     }
# }
3.2 Common parameters
3.3 Comparing lists
list1 = [1, 3, 5, {"a": 1}]
list2 = [1, 5, 3, {"a": 2}]
# Strict order (default)
diff1 = DeepDiff(list1, list2)
print(diff1)
# Ignore order
diff2 = DeepDiff(list1, list2, ignore_order=True)
print(diff2)
# {'values_changed': {"root[3]['a']": {'new_value': 2, 'old_value': 1}}}
3.4 Comparing sets
set1 = {1, 3, 5}
set2 = {2, 3, 4}
diff = DeepDiff(set1, set2)
print(diff)
# {
#     'set_item_added': {'root[2]', 'root[4]'},
#     'set_item_removed': {'root[1]', 'root[5]'}
# }
3.5 Comparing custom objects
class User:
    def __init__(self, name, roles):
        self.name = name
        self.roles = roles
user1 = User("Alice", ["admin", "editor"])
user2 = User("Bob", ["admin", "viewer"])
diff = DeepDiff(user1, user2)
print(diff)
# {
#     'values_changed': {
#         'root.name': {'new_value': 'Bob', 'old_value': 'Alice'},
#         'root.roles[1]': {'new_value': 'viewer', 'old_value': 'editor'}
#     }
# }
3.6 Excluding specific paths
t1 = {"system": {"version": "1.0", "config": {"timeout": 30}}}
t2 = {"system": {"version": "2.0", "config": {"timeout": 30}}}
# Ignore the version field; only config changes matter
diff = DeepDiff(t1, t2, exclude_paths=["root['system']['version']"])
print(diff)  # {} (no difference)
3.7 Controlling floating-point precision
t1 = {"score": 1.1234567}
t2 = {"score": 1.1234568}
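# Compare to 5 digits after the decimal point
diff = DeepDiff(t1, t2, significant_digits=5)
print(diff)  # {} (no difference)
# At 7 digits the two values no longer round to the same number
diff2 = DeepDiff(t1, t2, significant_digits=7)
print(diff2)  # reports values_changed
3.8 Output formats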
diff = DeepDiff(t1, t2)
# Convert to a plain dictionary
d = diff.to_dict()
# Convert to a JSON string
j = diff.to_json(indent=4)
# Human-readable summary
print(diff.pretty())
4. DeepSearch: deep search
DeepSearch looks for a given key or value inside a nested object and returns the paths of the matches.
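As an illustration of what such a search involves, the following standard-library sketch walks a nested structure and records paths in the same `root[...]` notation. It is not deepdiff's code: `find_paths` is a hypothetical helper, and it only handles dicts, lists, tuples, and substring matches on strings.

```python
def find_paths(obj, needle):
    """Collect paths whose dict key or string value contains `needle`."""
    found = {"matched_paths": [], "matched_values": []}

    def walk(node, node_path):
        if isinstance(node, dict):
            for key, value in node.items():
                child = f"{node_path}[{key!r}]"
                if isinstance(key, str) and needle in key:
                    found["matched_paths"].append(child)  # the key itself matched
                walk(value, child)
        elif isinstance(node, (list, tuple)):
            for index, value in enumerate(node):
                walk(value, f"{node_path}[{index}]")
        elif isinstance(node, str) and needle in node:
            found["matched_values"].append(node_path)  # a leaf value matched

    walk(obj, "root")
    return found


obj = {"users": [{"name": "Alice"}, {"name": "Bob"}], "admin": "Alice"}
by_value = find_paths(obj, "Alice")
# by_value["matched_values"] == ["root['users'][0]['name']", "root['admin']"]
by_key = find_paths(obj, "name")
# by_key["matched_paths"] == ["root['users'][0]['name']", "root['users'][1]['name']"]
```

The sketch keeps key hits and value hits apart: a match on a dict key goes under matched_paths, a match on a leaf value under matched_values.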
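from deepdiff import grep
obj = {"users": [{"name": "Alice"}, {"name": "Bob"}], "admin": "Alice"}
# Search for the value "Alice"; grep is applied with the | operator
result = obj | grep("Alice")
print(result)
# "Alice" occurs only as a value, so both hits appear under matched_values
# (the exact container repr may vary by version):
# {
#     'matched_values': ["root['users'][0]['name']", "root['admin']"]
# }
# Search for the key name "name"; verbose_level=2 also returns the matched items
result2 = obj | grep("name", verbose_level=2)
print(result2)
5. DeepHash: content hashing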
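DeepHash converts any Python object, including unhashable ones such as dict and list, into a hash based on its content, which gives a quick way to check whether two objects are identical in content.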
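The underlying idea can be sketched with the standard library alone: serialize the object into a canonical form, then hash the bytes. This is an illustration only, not how DeepHash works internally; `content_hash` is a hypothetical helper and is limited to JSON-serializable data.

```python
import hashlib
import json


def content_hash(obj):
    """Hash JSON-serializable data by content, independent of dict key order."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


same_a = content_hash({"list": [1, 2, 3], "nested": {"key": "value"}})
same_b = content_hash({"nested": {"key": "value"}, "list": [1, 2, 3]})
other = content_hash({"list": [1, 2, 4], "nested": {"key": "value"}})
# same_a == same_b (key order does not matter), same_a != other
```

A dict cannot be passed to the built-in hash(), but its canonical serialization can be hashed, so equal content yields equal digests.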
from deepdiff import DeepHash
data1 = {"list": [1, 2, 3], "nested": {"key": "value"}}
data2 = {"list": [1, 2, 3], "nested": {"key": "value"}}
data3 = {"list": [1, 2, 4], "nested": {"key": "value"}}
h1 = DeepHash(data1)
h2 = DeepHash(data2)
h3 = DeepHash(data3)
print(h1[data1] == h2[data2])  # True: same content
print(h1[data1] == h3[data3])  # False: different content
6. Delta: difference patches
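Delta stores the difference between two objects as a Delta object that can be serialized and saved, or applied to other objects to perform incremental updates.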
from deepdiff import DeepDiff, Delta
t1 = {"x": 1, "y": [1, 2]}
t2 = {"x": 2, "y": [1, 3]}
# Build a delta from the diff
diff = DeepDiff(t1, t2)
delta = Delta(diff)
# Apply the delta to t1 to obtain t2
result = delta + t1
print(result) # {'x': 2, 'y': [1, 3]}
print(result == t2) # True
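# Serialize and deserialize
serialized = delta.dumps()  # serialize to bytes
delta2 = Delta(serialized)  # restore from the serialized form
⚠️ By default, Delta restores only a built-in allow-list of types when deserializing. safe_to_import is a set of dotted names (for example {'decimal.Decimal'}) that extends this allow-list, not a boolean switch. When loading a Delta from an untrusted source, add only the names you explicitly trust, to limit the security risk.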
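The general technique behind such an allow-list can be sketched with the standard library's `pickle.Unpickler.find_class` hook. This is an assumption-labeled illustration rather than deepdiff's internal code: `ALLOWED`, `AllowListUnpickler`, and `safe_loads` are hypothetical names.

```python
import io
import pickle
from decimal import Decimal
from fractions import Fraction

# Hypothetical allow-list: only these (module, name) pairs may be imported.
ALLOWED = {("decimal", "Decimal")}


class AllowListUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks to import a global.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"{module}.{name} is not allow-listed")


def safe_loads(data: bytes):
    return AllowListUnpickler(io.BytesIO(data)).load()


# An allow-listed type loads normally.
trusted = safe_loads(pickle.dumps({"price": Decimal("1.50")}))

# A type outside the allow-list is refused before it is ever imported.
try:
    safe_loads(pickle.dumps(Fraction(1, 2)))
    rejected = False
except pickle.UnpicklingError:
    rejected = True
```

Refusing unknown globals at load time is what stops a crafted payload from importing and calling arbitrary code, which is why the allow-list should stay as small as possible.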
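7. extract: path extraction
extract pulls a value out of a nested object using a path string. The path format matches the paths in DeepDiff output, so the two are easy to combine.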
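To show how such a path string maps onto nested lookups, here is a minimal standard-library sketch. It is not deepdiff's parser: `extract_by_path` is a hypothetical helper that handles only bracketed keys and indexes, not attribute segments such as `root.name` or keys containing `]`.

```python
import ast
import re

_SEGMENT = re.compile(r"\[([^\]]+)\]")  # captures the text inside each [...]


def extract_by_path(obj, path):
    """Follow a path such as root['users'][0]['name'] through nested containers."""
    if not path.startswith("root"):
        raise ValueError("path must start with 'root'")
    current = obj
    for raw in _SEGMENT.findall(path[len("root"):]):
        key = ast.literal_eval(raw)  # "'users'" -> 'users', "0" -> 0
        current = current[key]
    return current


data = {"users": [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]}
first_name = extract_by_path(data, "root['users'][0]['name']")  # 'Alice'
second_age = extract_by_path(data, "root['users'][1]['age']")   # 25
```

Each bracketed segment becomes one indexing step, which is why a path reported by a diff can be replayed against either of the compared objects.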
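from deepdiff import DeepDiff, extract
data = {"users": [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]}
# Extract the value at a given path
value = extract(data, "root['users'][0]['name']")
print(value)  # Alice
# Combine with DeepDiff: look up the old and new value at each changed path
# (t1 and t2 are illustrative inputs)
t1 = {"name": "Alice", "age": 30}
t2 = {"name": "Bob", "age": 30}
diff = DeepDiff(t1, t2)
for path in diff.get("values_changed", {}):
    old = extract(t1, path)
    new = extract(t2, path)
    print(f"{path}: {old} → {new}")
8. Typical use cases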
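8.1 API test assertions
# generate_expected_response() and api.call() stand in for your own test helpers
expected = generate_expected_response()
actual = api.call()
diff = DeepDiff(
    expected, actual,
    ignore_order=True,
    # Exclude the timestamp key of every list item, e.g. root[0]['timestamp']
    # (assumes the items are dicts; brackets and quotes must be escaped)
    exclude_regex_paths=[r"root\[\d+\]\['timestamp'\]"],
)
assert not diff, f"API response does not match the expected value:\n{diff.pretty()}"
8.2 Configuration change detection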
config_old = load_config("v1.yaml")
config_new = load_config("v2.yaml")
diff = DeepDiff(
config_old, config_new,
exclude_paths=["root['updated_at']"]
)
if diff:
    send_alert(f"Configuration changed: {diff.to_json(indent=2)}")
8.3 Auditing database record changes
before = record.to_dict()
record.update(new_data)
after = record.to_dict()
diff = DeepDiff(before, after, exclude_paths=["root['updated_at']"])
if diff:
    AuditLog.create(user=current_user, changes=diff.to_json())
9. Performance tips
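Exclude fields you do not care about: use exclude_paths or exclude_regex_paths to skip fields that need no comparison and cut recursion overhead.
Limit recursion depth: for very large objects, set max_depth, if your installed version accepts it, so that deep recursion does not consume excessive memory.
Set a similarity threshold when ignoring order: use cutoff_intersection_for_pairs to control how list elements are paired, avoiding exhaustive O(n²) pairing.
Install the performance-optimized build: pip install "deepdiff[optimize]" pulls in orjson and similar libraries to speed up serialization.
Use significant_digits for floats: this avoids a flood of spurious differences caused by floating-point rounding error.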
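Before relying on an exclude_regex_paths pattern, it can help to check it against sample path strings using `re` alone. The sketch below assumes the pattern is applied to the path string with search semantics; the sample paths and variable names are made up for illustration.

```python
import re

# Brackets and quotes in a path are regex metacharacters and must be escaped.
TIMESTAMP = re.compile(r"root\[\d+\]\['timestamp'\]")
# The same pattern without escapes: [d+] is a character class, so it matches nothing here.
UNESCAPED = re.compile(r"root[d+].timestamp")

sample_paths = [
    "root[0]['timestamp']",
    "root[12]['timestamp']",
    "root[0]['status']",
    "root['meta']['timestamp']",
]

excluded = [p for p in sample_paths if TIMESTAMP.search(p)]
# excluded == ["root[0]['timestamp']", "root[12]['timestamp']"]
excluded_unescaped = [p for p in sample_paths if UNESCAPED.search(p)]
# excluded_unescaped == []
```

A pattern that silently matches nothing excludes nothing, so the comparison still visits every field. A quick check like this shows whether the exclusion is actually taking effect.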