# Python Coding Style Guide

Based on PEP 8 and the Google Python Style Guide.

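# Contents

01. Overview
- 1.1 Why Python needs a style guide
- 1.2 Core goals
02. Naming conventions
- 2.1 Naming summary table
- 2.2 Naming anti-patterns
03. Code formatting
04. Type annotations
05. Strings and documentation
06. Comparisons and conditionals
07. Exception handling
08. List and dict comprehensions
09. Context managers
10. Function guidelines
- 10.1 Parameters and return values
- 10.2 Decorators
11. Class design guidelines
- 11.1 dataclass and property
- 11.2 `__slots__` and abstract base classes
12. Concurrency and async
13. Performance optimization
14. Common anti-patterns
15. Toolchain and automation
16. Code review checklist
17. Quick reference of common pitfalls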

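# 01. Overview

# 1.1 Why Python needs a style guide

Question: Python is known for being concise and elegant, so why does it still need a coding standard?

Answer: Python's flexibility is a double-edged sword: the same logic can be written in several different styles. `typing` annotations or comment-style type hints? f-strings or `.format()`? `dataclass` or a hand-written `__init__`? A style guide settles these choices once, so that the whole codebase reads like repeated applications of a single pattern.

`import this`: the Zen of Python says, "There should be one-- and preferably only one --obvious way to do it." A style guide helps you find that one way.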

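# 1.2 Core goals

| Goal | Description |
| --- | --- |
| Readability | Code is documentation; reading it should feel like reading English |
| Consistency | The whole project looks as if one Pythonista wrote it |
| Robustness | Type annotations plus mypy catch many type bugs before the code runs |
| Maintainability | You can still open the code and change it six months later |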

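# 02. Naming conventions

# 2.1 Naming summary table

| Kind | Convention | Examples |
| --- | --- | --- |
| Module / package | lowercase with underscores | `my_module`, `data_utils` |
| Class | CapWords (PascalCase) | `MyClass`, `DataProcessor` |
| Exception class | CapWords with an `Error` suffix | `ValidationError` |
| Function / method | lowercase with underscores | `get_user()`, `calculate_total()` |
| Variable | lowercase with underscores | `user_count`, `max_retries` |
| Constant | uppercase with underscores | `MAX_SIZE`, `DEFAULT_TIMEOUT` |
| Non-public member | single leading underscore | `_internal_method`, `_private_var` |
| Name-mangled member | double leading underscore | `__very_private` (triggers name mangling) |
| Special ("dunder") method | double leading and trailing underscores | `__init__`, `__str__` |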

# ✅ Correct
class UserProfile:
    MAX_NAME_LENGTH = 50

    def __init__(self, user_id: int):
        self._id = user_id
        self.name: str = ""

    def get_display_name(self) -> str:
        return self.name or f"User_{self._id}"

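# 2.2 Naming anti-patterns

| Anti-pattern | Example | Improvement |
| --- | --- | --- |
| Single letters | `u`, `f`, `x` | `user`, `file_path` |
| Pinyin | `dian_hua` | `phone_number` |
| Redundant type suffix | `name_str`, `age_int` | `name`, `age` |
| Negated names | `is_not_empty` | Use a positive name (`is_empty`) or test truthiness directly (`if items:` / `if not items:`) |
| Dunder abuse | `__my_method__` | `_my_method` is enough to mark it as non-public |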

# 03. Code formatting

# 3.1 Indentation and spacing

# ✅ Indent with 4 spaces (never tabs)
def long_function_name(
    var_one: int,
    var_two: str,
    var_three: float,
) -> dict:
    result = {
        "one": var_one,
        "two": var_two,
    }
    return result

# ✅ Put spaces around binary operators
count = a + b * c
is_valid = (age >= 18) and (name is not None)

# ✅ Put a space after each comma
items = [1, 2, 3]
config = {"host": "localhost", "port": 8080}

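# ❌ Do not put spaces just inside brackets or before a colon
items = [ 1, 2, 3 ]    # wrong: spaces inside the brackets
data = {"a" : 1}        # wrong: space before the colon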

# 3.2 Import order

# ✅ Group order: standard library → third-party → local
import os
import sys
from typing import Optional

import requests
from flask import Flask

from myapp.utils import helper

# 3.3 Blank lines and line breaks

# ✅ Two blank lines between top-level classes, one blank line between methods
class FirstClass:
    def method_one(self):
        pass


class SecondClass:
    def method_one(self):
        pass

    # ✅ Use a single blank line to separate logical steps inside a method
    def process(self, data):
        cleaned = self._clean(data)

        validated = self._validate(cleaned)
        return self._execute(validated)

# 04. Type annotations

# 4.1 Basic type annotations

from typing import Optional

# ✅ Function annotations
def find_user(user_id: int) -> Optional[dict[str, object]]:
    """Look up a user; return None if not found."""
    ...

# ✅ Built-in generic syntax (Python 3.9+, recommended; no import from typing needed)
def process(items: list[str]) -> dict[str, int]:
    ...

# ✅ Variable annotations
users: list[str] = []
config: dict[str, str | int] = {}          # the `|` union syntax requires Python 3.10+

# ✅ Class attribute annotations
class DataStore:
    cache: dict[str, bytes]
    max_size: int = 1024 * 1024

    def __init__(self) -> None:
        self.cache = {}

# ✅ Callables
from collections.abc import Callable
Handler = Callable[[int, str], bool]

# ❌ Do not leave out type annotations
def find_user(user_id):  # missing annotations
    ...

# 4.2 TypedDict and Protocol [Recommended]

from typing import TypedDict, Protocol, runtime_checkable

# ✅ TypedDict: describes the shape of JSON / dict data
class UserDict(TypedDict):
    id: int
    name: str
    email: str | None

def parse_user(raw: dict) -> UserDict:
    return {"id": raw["id"], "name": raw["name"], "email": raw.get("email")}

# ✅ Protocol: structural typing (the statically typed form of duck typing)
@runtime_checkable
class HasName(Protocol):
    name: str

def greet(obj: HasName) -> str:
    return f"Hello, {obj.name}!"

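# ✅ User and Admin both satisfy the HasName protocol without inheriting from it
# (both are assumed to be classes with a `name: str` attribute)
greet(User(name="yc"))       # ✅
greet(Admin(name="admin"))   # ✅

# ❌ An object that does not satisfy the protocol → mypy reports an error
# greet(42)                   # ❌ int has no `name` attribute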
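
The snippet above uses `User` and `Admin` without defining them. A minimal runnable sketch follows, assuming both are simple dataclasses (these definitions are hypothetical and not part of the original guide). It also shows what `@runtime_checkable` adds: `isinstance()` checks that the attribute exists, but not that it has the declared type.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasName(Protocol):
    name: str


@dataclass
class User:  # hypothetical class for illustration
    name: str


@dataclass
class Admin:  # hypothetical class; shares no base class with User
    name: str
    level: int = 1


def greet(obj: HasName) -> str:
    return f"Hello, {obj.name}!"


print(greet(User(name="yc")))
print(greet(Admin(name="admin")))

# The runtime check only looks for the presence of `name`.
print(isinstance(User(name="yc"), HasName))   # True
print(isinstance(42, HasName))                # False
```

Static checkers such as mypy verify the attribute type as well, so the runtime check is best treated as a coarse guard rather than a replacement for type checking.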

# 4.3 Overload annotations (`@overload`) [Recommended]

from typing import overload

# ✅ Different argument types map to different return types
@overload
def parse(data: str) -> int: ...
@overload
def parse(data: list[str]) -> list[int]: ...
def parse(data: str | list[str]) -> int | list[int]:
    if isinstance(data, str):
        return int(data)
    return [int(x) for x in data]

# mypy infers the correct return type for each call
a: int = parse("42")          # ✅
b: list[int] = parse(["1", "2"])  # ✅

# 05. Strings and documentation

# ✅ Prefer f-strings (Python 3.6+)
name = "World"
count = 2
msg = f"Hello, {name}! Count: {count + 1}"

# ❌ Old-style % formatting
msg = "Hello, %s! Count: %d" % (name, count + 1)

# ✅ Multi-line strings
query = """
    SELECT id, name, email
    FROM users
    WHERE active = 1
    ORDER BY name
"""

# ✅ Long strings: rely on implicit concatenation inside parentheses
msg = (
    f"User {user.name} has {len(orders)} orders "
    f"with total amount {total:.2f}"
)

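# ✅ Docstrings (Google style)
def calculate(price: float, tax: float = 0.1) -> float:
    """Calculate the price including tax.

    Args:
        price: The original price.
        tax: The tax rate. Defaults to 0.1.

    Returns:
        The total price including tax.

    Raises:
        ValueError: If price is negative.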
    """
    if price < 0:
        raise ValueError("price must be non-negative")
    return price * (1 + tax)

# 06. Comparisons and conditionals

# ✅ Compare against None with `is`
if user is None:
    return
if data is not None:
    process(data)

# ✅ Test for empty sequences via truthiness (empty containers are falsy)
if not items:            # rather than len(items) == 0
    return []
if not user_name:        # empty string
    user_name = "Unknown"

# ✅ Test booleans directly
if is_active:            # rather than is_active == True
    ...

# ✅ Conditional expression (ternary)
status = "active" if user.is_active else "inactive"

# ✅ Use `in` to test against several values
if status in ("active", "pending"):
    ...

# ❌ Avoid
if status == "active" or status == "pending":
    ...

# ✅ Chained comparison
if 0 <= score <= 100:
    grade = "valid"
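
The `is None` rule and the truthiness rule can pull in different directions when a falsy value such as `0` is legitimate input. The sketch below uses two hypothetical functions (not from the original guide) to show where they diverge.

```python
from typing import Optional


def describe(limit: Optional[int] = None) -> str:
    # `is None` separates "not provided" from a legitimate falsy value like 0.
    if limit is None:
        return "no limit"
    return f"limit={limit}"


def describe_buggy(limit: Optional[int] = None) -> str:
    # Truthiness treats 0 the same as None, which is rarely what is meant here.
    if not limit:
        return "no limit"
    return f"limit={limit}"


print(describe(0))        # limit=0
print(describe_buggy(0))  # no limit
```

A reasonable reading of the two rules together: use truthiness for "is this container empty?", and `is None` whenever `None` is a sentinel for "no value".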

# 07. Exception handling

# ✅ Catch specific exceptions
try:
    result = api.call()
except ConnectionError as e:
    logger.error("API connection failed: %s", e)
    raise
except ValueError:
    return default_value
finally:
    cleanup()

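# ❌ Do not catch every exception (unless you really need to)
try:
    ...
except Exception:          # too broad
    pass                   # silently swallowing the error is even worse

# ✅ Custom exceptions
class ValidationError(Exception):
    """Raised when data validation fails."""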
    def __init__(self, field: str, msg: str) -> None:
        self.field = field
        super().__init__(f"{field}: {msg}")

# ✅ Use `raise ... from` to preserve the exception chain
try:
    process_data()
except ValueError as e:
    raise AppError("processing failed") from e
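
To make the effect of `raise ... from` concrete, here is a small runnable sketch. `AppError` and `parse_port` are hypothetical names chosen for the illustration; the point is that the original exception stays reachable through `__cause__`.

```python
class AppError(Exception):
    """Hypothetical application-level error used for this illustration."""


def parse_port(raw: str) -> int:
    try:
        return int(raw)
    except ValueError as exc:
        # `from exc` stores the low-level error on the new exception's __cause__.
        raise AppError(f"invalid port: {raw!r}") from exc


try:
    parse_port("80a")
except AppError as err:
    caught = err  # keep a reference; `err` is cleared when the block ends

print(type(caught.__cause__).__name__)  # ValueError
```

In a traceback, the chained error appears under "The above exception was the direct cause of the following exception", so the root cause is not lost.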

# 08. List and dict comprehensions

# ✅ Basic comprehension forms
squares = [x * x for x in range(10)]
active_users = [u for u in users if u.is_active]
name_map = {u.id: u.name for u in users}
unique_tags = {tag for article in articles for tag in article.tags}

# ✅ Generator expression (for large inputs; no intermediate list is built)
total = sum(x * x for x in big_dataset)

# ✅ Conditional expression inside a comprehension
values = [x if x > 0 else 0 for x in data]   # with else

# ❌ Comprehension nested too deeply (more than 2 levels)
# flattened = [y for x in outer for z in x for y in z]  # hard to read
flat = []
for x in outer:
    for z in x:
        flat.extend(z)           # ✅ an explicit loop is clearer
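
For the common case of flattening one level of nesting, the standard library's `itertools.chain.from_iterable` is another readable option. This is a sketch with made-up sample data, shown next to the explicit loop for comparison.

```python
from itertools import chain

nested = [[1, 2], [3], [], [4, 5]]  # sample data for illustration

# Explicit loop, as recommended above for deeper nesting
flat_loop = []
for row in nested:
    flat_loop.extend(row)

# One level of flattening without a nested comprehension
flat_chain = list(chain.from_iterable(nested))

print(flat_loop)   # [1, 2, 3, 4, 5]
print(flat_chain)  # [1, 2, 3, 4, 5]
```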

# 09. Context managers

# ✅ The `with` statement releases resources automatically
with open("data.json", "r") as f:
    content = f.read()

# ✅ Multiple context managers in one `with` statement
with open("src.txt") as src, open("dst.txt", "w") as dst:
    dst.write(src.read())

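# ✅ Custom context manager
import time
from collections.abc import Iterator
from contextlib import contextmanager

@contextmanager
def timer(name: str) -> Iterator[None]:
    start = time.perf_counter()   # monotonic clock, suited to measuring intervals
    try:
        yield
    finally:
        # Runs even if the body raises, so the timing is always reported
        elapsed = time.perf_counter() - start
        print(f"{name}: {elapsed:.2f}s")

with timer("compute stage"):
    heavy_computation()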

# 10. Function guidelines

# 10.1 Parameters and return values

# ✅ Use None as the default instead of a mutable object
def add_item(item: str, items: list[str] | None = None) -> list[str]:
    if items is None:
        items = []
    items.append(item)
    return items

# ❌ Mutable default argument (a classic trap)
def add_item(item, items=[]):  # every call shares the same list!
    items.append(item)
    return items
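
The shared-list behavior is easy to observe directly. In this sketch the two variants are given distinct names (`add_item_shared` is a renamed copy of the anti-pattern, used only for the demonstration) so that both can live in one file.

```python
def add_item_shared(item, items=[]):  # anti-pattern, kept only to show the bug
    items.append(item)
    return items


def add_item(item, items=None):
    if items is None:
        items = []  # a fresh list on every call
    items.append(item)
    return items


first = add_item_shared("a")
second = add_item_shared("b")
print(first is second)  # True: both names point at the one default list
print(second)           # ['a', 'b']

print(add_item("a"))    # ['a']
print(add_item("b"))    # ['b']
```

The default value is evaluated once, when the `def` statement runs, which is why the list survives between calls.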

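# ✅ Keyword-only parameters (everything after `*` must be passed by keyword)
def connect(host: str, port: int, *, timeout: int = 30, ssl: bool = True):
    ...

connect("localhost", 8080, timeout=60)      # ✅
connect("localhost", 8080, 60, True)         # ❌ TypeError: timeout and ssl are keyword-only

# ✅ Keep function bodies under 50 lines
# ✅ Keep parameters to 5 or fewer; beyond that, bundle them into a dataclass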

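# 10.2 Decorators [Recommended]

import functools
import time

# ✅ Custom decorator: use functools.wraps to preserve metadata
def retry(max_attempts: int = 3, delay: float = 1.0):
    """Retry decorator: re-run the call until it succeeds or attempts run out."""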
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for i in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if i == max_attempts - 1:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator

@retry(max_attempts=5, delay=0.5)
def unstable_api():
    ...
# The original name is preserved: unstable_api.__name__ == 'unstable_api'  # ✅ guaranteed by wraps

# ✅ Optimization helpers from the standard library
@functools.lru_cache(maxsize=128)     # cache results per argument
def fibonacci(n: int) -> int: ...

@functools.singledispatch             # single-dispatch generic function
def format_value(val) -> str: ...
@format_value.register(int)
def _(val: int) -> str: return f"Quantity: {val}"
@format_value.register(str)
def _(val: str) -> str: return f"Name: {val}"

# ❌ A decorator that changes the function signature without preserving metadata
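
The standard-library decorators above are shown with elided bodies. A filled-in sketch follows; the function bodies and the fallback message are assumptions added for illustration. `cache_info()` is the documented way to inspect an `lru_cache`.

```python
import functools


@functools.lru_cache(maxsize=128)
def fibonacci(n: int) -> int:
    # Naive recursion becomes linear once results are cached.
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


@functools.singledispatch
def format_value(val: object) -> str:
    return f"value: {val}"  # fallback for unregistered types


@format_value.register
def _(val: int) -> str:  # dispatch type is taken from the annotation
    return f"count: {val}"


value = fibonacci(30)
info = fibonacci.cache_info()
print(value)        # 832040
print(info.misses)  # 31: one miss per distinct argument 0..30

print(format_value(3))    # count: 3
print(format_value("x"))  # value: x
```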

# 11. Class design guidelines

# 11.1 dataclass and property

from dataclasses import dataclass

# ✅ Prefer dataclass for plain data containers
@dataclass
class User:
    id: int
    name: str
    email: str | None = None

# ✅ Expose attributes through @property (hide the internal representation)
class Circle:
    def __init__(self, radius: float) -> None:
        self._radius = radius

    @property
    def radius(self) -> float:
        return self._radius

    @radius.setter
    def radius(self, value: float) -> None:
        if value < 0:
            raise ValueError("radius must be >= 0")
        self._radius = value

# ✅ Static methods vs class methods
class DateUtils:
    @staticmethod
    def is_weekend(date): ...          # does not depend on the class

    @classmethod
    def from_string(cls, s: str): ...  # depends on the class (often used as a factory)
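
Two dataclass options come up often alongside the basics above: `field(default_factory=...)` for mutable defaults and `frozen=True` for immutable value objects. The classes in this sketch (`Tag`, `Article`) are hypothetical examples, not part of the original guide.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class Tag:
    name: str


@dataclass
class Article:
    title: str
    # default_factory builds a new list per instance, avoiding shared state
    tags: list = field(default_factory=list)


first = Article("intro")
second = Article("advanced")
first.tags.append(Tag("python"))
print(second.tags)                    # []
print(Tag("python") == Tag("python")) # True: dataclasses compare by field values

try:
    first.tags[0].name = "other"      # frozen instances reject assignment
except FrozenInstanceError:
    frozen_blocked = True
else:
    frozen_blocked = False
print(frozen_blocked)                 # True
```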

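# 11.2 `__slots__` and abstract base classes [Recommended]

# ✅ __slots__: saves memory when there are many instances (and forbids adding attributes dynamically)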
class Point3D:
    __slots__ = ('x', 'y', 'z')

    def __init__(self, x: float, y: float, z: float) -> None:
        self.x, self.y, self.z = x, y, z

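# With many thousands of Point3D objects, per-instance memory drops noticeably
# (the exact saving depends on the Python version and the number of attributes)
# Note: subclasses must declare their own __slots__, otherwise their instances get a __dict__ again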
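
The subclass caveat is worth seeing in action. In this sketch, `LoosePoint` and `StrictPoint` are hypothetical subclasses added for illustration; the checks show which instances still carry a per-instance `__dict__`.

```python
class Point3D:
    __slots__ = ("x", "y", "z")

    def __init__(self, x: float, y: float, z: float) -> None:
        self.x, self.y, self.z = x, y, z


class LoosePoint(Point3D):
    pass  # no __slots__ here, so instances get a __dict__ again


class StrictPoint(Point3D):
    __slots__ = ("label",)  # only the new attribute; x, y, z are inherited


p = Point3D(1.0, 2.0, 3.0)
try:
    p.w = 4.0  # not listed in __slots__
except AttributeError:
    blocked = True
else:
    blocked = False

print(blocked)                                        # True
print(hasattr(p, "__dict__"))                         # False
print(hasattr(LoosePoint(0, 0, 0), "__dict__"))       # True
print(hasattr(StrictPoint(0, 0, 0), "__dict__"))      # False
```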

# ✅ Abstract base classes (ABC): define an interface contract
from abc import ABC, abstractmethod

class Storage(ABC):
    @abstractmethod
    def read(self, key: str) -> bytes: ...
    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...

class FileStorage(Storage):
    def read(self, key: str) -> bytes: ...
    def write(self, key: str, data: bytes) -> None: ...

# Storage() → TypeError: an abstract class cannot be instantiated
# A subclass that leaves any abstract method unimplemented → TypeError when instantiated
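
A runnable sketch of the two `TypeError` cases: `MemoryStorage` and `ReadOnlyStorage` are hypothetical subclasses invented for the demonstration, and `can_instantiate` is a small helper that is not part of any library.

```python
from abc import ABC, abstractmethod


class Storage(ABC):
    @abstractmethod
    def read(self, key: str) -> bytes: ...

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...


class MemoryStorage(Storage):  # implements the full contract
    def __init__(self) -> None:
        self._data: dict = {}

    def read(self, key: str) -> bytes:
        return self._data[key]

    def write(self, key: str, data: bytes) -> None:
        self._data[key] = data


class ReadOnlyStorage(Storage):  # `write` is missing on purpose
    def read(self, key: str) -> bytes:
        return b""


def can_instantiate(cls: type) -> bool:
    """Illustrative helper: report whether cls() succeeds."""
    try:
        cls()
    except TypeError:
        return False
    return True


print(can_instantiate(Storage))          # False
print(can_instantiate(MemoryStorage))    # True
print(can_instantiate(ReadOnlyStorage))  # False
```

Note that the error is raised at instantiation time, not when the incomplete subclass is defined.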

# ✅ Data validation: Pydantic (third-party, but highly recommended)
# Pydantic v2 API shown; v1 used `@validator` instead of `@field_validator`
from pydantic import BaseModel, Field, field_validator

class UserModel(BaseModel):
    id: int
    name: str = Field(min_length=1, max_length=50)
    email: str | None = None

    @field_validator('name')
    @classmethod
    def name_not_blank(cls, v: str) -> str:
        if not v.strip():
            raise ValueError('name must not be blank')
        return v.strip()

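# 12. Concurrency and async

import asyncio

import aiohttp  # third-party

# ✅ async/await for concurrent I/O
# `fetch` is an assumed helper; the original text does not define it
async def fetch(session: aiohttp.ClientSession, url: str) -> dict:
    async with session.get(url) as resp:
        return await resp.json()

async def fetch_all(urls: list[str]) -> list[dict]:
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url) for url in urls]
        return await asyncio.gather(*tasks)

# ✅ Run blocking (synchronous) calls in a worker thread (Python 3.9+)
# Threads do not speed up CPU-bound code because of the GIL; use a process pool for that
def blocking_call(data):
    ...

async def main(large_data):
    return await asyncio.to_thread(blocking_call, large_data)

# ✅ Timeouts (asyncio.timeout requires Python 3.11+)
async def fetch_with_timeout(session, url: str, timeout: float = 10):
    async with asyncio.timeout(timeout):
        return await fetch(session, url)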
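
The patterns in this section can be tried without any network access. The sketch below swaps the HTTP call for `asyncio.sleep` (so `fake_fetch` is a stand-in, not a real client) and uses `asyncio.wait_for`, which also works on Python versions older than 3.11.

```python
import asyncio


async def fake_fetch(url: str, delay: float) -> str:
    # Stand-in for a network call; sleeping yields control to the event loop.
    await asyncio.sleep(delay)
    return f"done:{url}"


async def main() -> tuple:
    # gather runs both coroutines concurrently and keeps the argument order.
    results = await asyncio.gather(fake_fetch("a", 0.02), fake_fetch("b", 0.01))

    try:
        await asyncio.wait_for(fake_fetch("slow", 1.0), timeout=0.05)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return results, timed_out


results, timed_out = asyncio.run(main())
print(results)    # ['done:a', 'done:b']
print(timed_out)  # True
```

Even though `b` finishes first, `gather` returns results in the order the awaitables were passed.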

# 13. Performance optimization

# ✅ Generators instead of lists (saves memory)
# ❌ sum([x*x for x in range(1_000_000)])  # builds a million-element list in memory first
total = sum(x * x for x in range(1_000_000))  # ✅ lazy evaluation

# ✅ Concatenate strings with join
result = "".join(items)                         # O(n)
# ❌ result = ""
# ❌ for s in items: result += s                # can degrade to O(n²)

# ✅ Membership tests: set/dict beat list
lookup_set = set(large_list)
if target in lookup_set: ...                     # O(1) on average
# if target in large_list: ...                   # O(n)

# ✅ Prefer local variables to globals (locals are read with the fast LOAD_FAST bytecode)
def compute():
    local = heavy_computation()                  # fast
    return local

# ✅ collections.deque beats list as a queue (O(1) at both ends)
from collections import deque
q = deque()
q.append(1)
q.popleft()

# ✅ For large datasets use numpy / pandas (C extensions, often 10-100x faster than pure Python for vectorized work)
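
Complexity claims like these are easy to check with `timeit` from the standard library. The sketch below measures the membership-test case; the sizes and repeat count are arbitrary choices, and absolute timings will vary by machine.

```python
import timeit

large_list = list(range(100_000))
lookup_set = set(large_list)
target = 99_999  # worst case for the list: the last element

# Each call to timeit runs the lambda `number` times and returns total seconds.
list_time = timeit.timeit(lambda: target in large_list, number=200)
set_time = timeit.timeit(lambda: target in lookup_set, number=200)

print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```

Building the set costs O(n) once, so the conversion pays off when the same collection is queried many times.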

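# 14. Common anti-patterns

| Anti-pattern | Problem | Improvement |
| --- | --- | --- |
| `from module import *` | Pollutes the namespace | Import names explicitly |
| Bare `except:` | Also swallows KeyboardInterrupt, SystemExit, etc. | `except Exception:` or a specific exception |
| String concatenation in a loop | O(n²) performance | `"".join(parts)` |
| Mutable default argument | The object is shared across calls | Default to None |
| `if x == True` | Redundant | `if x` |
| `type(obj) == type(...)` | Ignores subclasses | `isinstance(obj, ...)` |
| `try-except-pass` | Swallowed errors cannot be diagnosed | At least log the cause |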
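
The `type()` versus `isinstance()` row is the least obvious entry in the table, so here is a short sketch with two hypothetical classes showing how the results differ for a subclass instance.

```python
class Animal:
    pass


class Dog(Animal):
    pass


pet = Dog()

exact_match = type(pet) == Animal          # compares exact types only
subclass_aware = isinstance(pet, Animal)   # follows the inheritance chain

print(exact_match)     # False
print(subclass_aware)  # True
```

The same applies to built-ins: `bool` is a subclass of `int`, so `isinstance(True, int)` is `True`.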

# 15. Toolchain and automation

# pyproject.toml
[tool.ruff]
line-length = 120
target-version = "py311"

[tool.ruff.lint]
select = ["E", "F", "I", "N", "W", "UP", "B"]
# E/W: pycodestyle errors / warnings
# F: pyflakes checks
# I: isort import sorting
# N: pep8-naming
# UP: pyupgrade (modern syntax)
# B: flake8-bugbear (likely bugs)

[tool.ruff.format]
quote-style = "double"

[tool.mypy]
python_version = "3.11"
strict = true
warn_unreachable = true

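| Tool | Purpose |
| --- | --- |
| `ruff` | Formatting + linting (replaces flake8 + isort + black) |
| `mypy` | Type checking (must run in CI) |
| `pytest` | Unit testing |
| `pre-commit` | Automatic checks before each Git commit |
| `bandit` | Security vulnerability scanning |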

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0
    hooks:
      - id: ruff
      - id: ruff-format
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.11.0
    hooks:
      - id: mypy
        additional_dependencies: [types-requests]

pre-commit install          # run the hooks automatically on every git commit
pre-commit run --all-files  # run all hooks manually against every file

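# 16. Code review checklist

## Naming and formatting
- [ ] Variables/functions use snake_case, classes PascalCase, constants UPPER_CASE
- [ ] No single-letter names (loop indices excepted), pinyin, or vague names
- [ ] 4-space indentation, no tabs, lines ≤ 120 characters
- [ ] Import order: standard library → third-party → local

## Types and documentation
- [ ] Public functions have type annotations (parameters and return value)
- [ ] Use built-in generics on Python 3.9+ (`list[str]` rather than `List[str]`)
- [ ] Public modules/classes/functions have docstrings

## Logic and robustness
- [ ] Compare against None with `is` / `is not`
- [ ] Test empty sequences with `if not items` rather than `len() == 0`
- [ ] Catch specific exception types; no bare `except:`, no `except: pass`
- [ ] No mutable objects (list/dict, etc.) as default arguments
- [ ] Files, network connections, and locks are managed with `with`

## Design
- [ ] Data classes use `@dataclass` rather than a hand-written `__init__`
- [ ] Classes with many instances define `__slots__`
- [ ] Decorators use `@functools.wraps` to preserve metadata
- [ ] Regular expressions are precompiled with `re.compile()`

## Performance
- [ ] Large inputs use generator expressions; no intermediate lists
- [ ] Strings are joined with `"".join()`, not with `+=` in a loop
- [ ] Membership tests use set/dict (O(1)) rather than list (O(n))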

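# 17. Quick reference of common pitfalls

# 17.1 Mutable default arguments

| # | Pitfall | Fix |
| --- | --- | --- |
| 1 | `def f(items=[]):` → every call shares the same list | `def f(items=None):` then `if items is None: items = []` |
| 2 | `def f(cache={}):` → same problem | `def f(cache=None):` then `if cache is None: cache = {}` |
| 3 | Class attribute `items: list = []` → shared by all instances | Assign `self.items = []` in `__init__` |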
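
A common shortcut for the None default is `items = items or []`. The sketch below (all names are hypothetical) shows how that shortcut differs from an explicit `is None` check when the caller passes an empty list, and then demonstrates the shared class attribute from row 3.

```python
def append_with_or(item, items=None):
    items = items or []  # an empty list from the caller is falsy and gets replaced
    items.append(item)
    return items


def append_with_is_none(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items


class SharedBasket:
    items: list = []  # anti-pattern: one list object stored on the class


class Basket:
    def __init__(self) -> None:
        self.items: list = []  # one list per instance


caller_list: list = []
append_with_or("x", caller_list)
print(caller_list)  # []: the caller's list was silently ignored

append_with_is_none("x", caller_list)
print(caller_list)  # ['x']

a, b = SharedBasket(), SharedBasket()
a.items.append("apple")
print(b.items)      # ['apple']: b sees a's item

c, d = Basket(), Basket()
c.items.append("apple")
print(d.items)      # []
```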

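# 17.2 Loop and closure pitfalls

| # | Pitfall | Fix |
| --- | --- | --- |
| 1 | `lambda: i` created in a loop sees the final value of `i` | `lambda i=i: i` (default arguments are bound at definition time) |
| 2 | Modifying a list while iterating over it → elements get skipped | Iterate over a copy (`for x in items[:]:`) or build a new list with a comprehension |
| 3 | Comparing numbers with `is` → unreliable outside the small-integer cache | Compare values with `==`; reserve `is` for `None` and other singletons |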
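
The first two rows can be reproduced in a few lines. This is a minimal sketch with made-up data: closures look up `i` when called, not when created, and removing items during iteration shifts the remaining elements under the iterator.

```python
# Late binding: every lambda reads the same variable after the loop has finished.
late = [lambda: i for i in range(3)]
bound = [lambda i=i: i for i in range(3)]  # default argument captures the current value

late_values = [f() for f in late]
bound_values = [f() for f in bound]
print(late_values)   # [2, 2, 2]
print(bound_values)  # [0, 1, 2]

# Mutating while iterating: the second 2 is skipped.
nums = [1, 2, 2, 3]
for n in nums:
    if n == 2:
        nums.remove(n)
print(nums)          # [1, 2, 3]

# Building a new list avoids the problem.
cleaned = [n for n in [1, 2, 2, 3] if n != 2]
print(cleaned)       # [1, 3]
```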

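# 17.3 Performance pitfalls

| # | Pitfall | Fix |
| --- | --- | --- |
| 1 | `+` string concatenation in a loop → O(n²) | `"".join(parts)` or `io.StringIO` |
| 2 | `import x` inside a function that runs in a loop | Import at module level (Python caches imported modules in `sys.modules`) |
| 3 | `in list` lookups inside a loop → O(n) each time | Convert to a set first; `in` is then O(1) on average |
| 4 | Using threads for CPU-bound work under the GIL | Use `multiprocessing` for CPU-bound work, `asyncio` (or threads) for I/O-bound work |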

Last updated: 2026/07/14, 09:42:39
