4 Commits

Author SHA1 Message Date
suyiiyii 05e8d3ba37 🐛 Remove unnecessary command 2024-09-01 10:26:18 +08:00
suyiiyii adf14840df 🐛 Improve formatting of the no-permission prompt 2024-08-21 18:52:27 +08:00
suyiiyii bbe5a22479 Add unit tests for no_permission_matcher 2024-08-21 01:21:13 +08:00
suyiiyii 85cc112599 Return a prompt when a user without permission tries to add a subscription 2024-08-21 00:22:20 +08:00
24 changed files with 8487 additions and 11280 deletions
+2 -2
@@ -7,7 +7,7 @@ ci:
autoupdate_commit_msg: ":arrow_up: auto update by pre-commit hooks"
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.6.3
rev: v0.6.0
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix]
@@ -34,7 +34,7 @@ repos:
stages: [commit]
- repo: https://github.com/pre-commit/mirrors-eslint
rev: v9.9.1
rev: v9.8.0
hooks:
- id: eslint
additional_dependencies:
+1 -6
@@ -1,14 +1,9 @@
# Change Log
## v0.9.4
## Recent Updates
### Bug Fixes
- The FSM should not crash when an external function it runs raises an exception [@AzideCupric](https://github.com/AzideCupric) ([#616](https://github.com/MountainDash/nonebot-bison/pull/616))
- Return a prompt when a user without permission tries to add a subscription [@suyiiyii](https://github.com/suyiiyii) ([#617](https://github.com/MountainDash/nonebot-bison/pull/617))
- Improve the phase behavior of the Bilibili request strategy [@AzideCupric](https://github.com/AzideCupric) ([#610](https://github.com/MountainDash/nonebot-bison/pull/610))
- RSS no longer strips formatting characters [@suyiiyii](https://github.com/suyiiyii) ([#615](https://github.com/MountainDash/nonebot-bison/pull/615))
- forbid adding platform that needs browser in no-browser env [@felinae98](https://github.com/felinae98) ([#609](https://github.com/MountainDash/nonebot-bison/pull/609))
- Fix code warnings in the project [@AzideCupric](https://github.com/AzideCupric) ([#614](https://github.com/MountainDash/nonebot-bison/pull/614))
- Fix anonymous_site() not working correctly [@felinae98](https://github.com/felinae98) ([#606](https://github.com/MountainDash/nonebot-bison/pull/606))
+18 -18
@@ -5,26 +5,26 @@
"homepage": "bison",
"proxy": "http://127.0.0.1:8080",
"dependencies": {
"@arco-design/web-react": "^2.64.0",
"@babel/core": "^7.25.2",
"@arco-design/web-react": "^2.63.1",
"@babel/core": "^7.24.7",
"@babel/plugin-syntax-flow": "^7.24.7",
"@babel/plugin-transform-react-jsx": "^7.25.2",
"@babel/plugin-transform-react-jsx": "^7.24.7",
"@reduxjs/toolkit": "^1.9.7",
"@testing-library/dom": "^10.4.0",
"@testing-library/react": "^16.0.1",
"@testing-library/react": "^16.0.0",
"@testing-library/user-event": "^14.5.2",
"@types/jest": "^29.5.13",
"@types/node": "^20.16.5",
"@types/react": "^18.3.7",
"@types/jest": "^29.5.12",
"@types/node": "^20.14.10",
"@types/react": "^18.3.3",
"@types/react-dom": "^18.3.0",
"react": "^18.3.1",
"react-dom": "^18.3.1",
"react-redux": "^9.1.2",
"react-router-dom": "^6.26.2",
"react-router-dom": "^6.24.1",
"react-scripts": "5.0.1",
"redux": "^5.0.1",
"redux-persist": "^6.0.0",
"typescript": "^5.6.2",
"typescript": "^5.5.4",
"web-vitals": "^3.5.2"
},
"scripts": {
@@ -53,17 +53,17 @@
]
},
"devDependencies": {
"@testing-library/jest-dom": "^6.5.0",
"@typescript-eslint/eslint-plugin": "^8.6.0",
"@typescript-eslint/parser": "^8.6.0",
"eslint": "^8.57.1",
"@testing-library/jest-dom": "^6.4.6",
"@typescript-eslint/eslint-plugin": "^8.0.0",
"@typescript-eslint/parser": "^8.0.0",
"eslint": "^9.6.0",
"eslint-config-airbnb": "^19.0.4",
"eslint-config-airbnb-typescript": "^18.0.0",
"eslint-import-resolver-typescript": "^3.6.3",
"eslint-plugin-import": "^2.30.0",
"eslint-plugin-jsx-a11y": "^6.10.0",
"eslint-plugin-react": "^7.36.1",
"eslint-import-resolver-typescript": "^3.6.1",
"eslint-plugin-import": "^2.29.1",
"eslint-plugin-jsx-a11y": "^6.9.0",
"eslint-plugin-react": "^7.34.3",
"eslint-plugin-react-hooks": "^4.6.2",
"eslint-plugin-react-redux": "^4.2.0"
"eslint-plugin-react-redux": "^4.1.0"
}
}
+6071 -8011
File diff suppressed because it is too large
+1 -1
@@ -1,4 +1,4 @@
FROM node:20.17.0 as frontend
FROM node:20.15.1 as frontend
ADD . /app
WORKDIR /app/admin-frontend
RUN yarn && yarn build
+1 -1
@@ -1,4 +1,4 @@
# syntax=docker/dockerfile:1.10
# syntax=docker/dockerfile:1.8
FROM python:3.11-slim-bullseye as base
FROM base as builder
+1 -1
@@ -1,4 +1,4 @@
# syntax=docker/dockerfile:1.10
# syntax=docker/dockerfile:1.8
FROM python:3.11-slim-bullseye as base
FROM base as builder
-13
@@ -3,7 +3,6 @@ from pkgutil import iter_modules
from collections import defaultdict
from importlib import import_module
from ..plugin_config import plugin_config
from .platform import Platform, make_no_target_group
_package_dir = str(Path(__file__).resolve().parent)
@@ -23,15 +22,3 @@ for name, platform_list in _platform_list.items():
platform_manager[name] = platform_list[0]
else:
platform_manager[name] = make_no_target_group(platform_list)
def _get_unavailable_platforms() -> dict[str, str]:
res = {}
for name, platform in platform_manager.items():
if platform.site.require_browser and not plugin_config.bison_use_browser:
res[name] = "需要启用 bison_use_browser"
return res
# platform => reason for not available
unavailable_paltforms: dict[str, str] = _get_unavailable_platforms()
+1 -62
@@ -2,24 +2,10 @@ import sys
import asyncio
import inspect
from enum import Enum
from functools import wraps
from dataclasses import dataclass
from collections.abc import Set as AbstractSet
from collections.abc import Callable, Sequence, Awaitable, AsyncGenerator
from typing import (
TYPE_CHECKING,
Any,
Generic,
TypeVar,
Protocol,
ParamSpec,
TypeAlias,
TypedDict,
NamedTuple,
Concatenate,
overload,
runtime_checkable,
)
from typing import TYPE_CHECKING, Any, Generic, TypeVar, Protocol, TypeAlias, TypedDict, NamedTuple, runtime_checkable
from nonebot import logger
@@ -31,7 +17,6 @@ TAddon = TypeVar("TAddon", contravariant=True)
TState = TypeVar("TState", contravariant=True)
TEvent = TypeVar("TEvent", contravariant=True)
TFSM = TypeVar("TFSM", bound="FSM", contravariant=True)
P = ParamSpec("P")
class StateError(Exception): ...
@@ -178,52 +163,6 @@ class FSM(Generic[TState, TEvent, TAddon]):
self.started = False
del self.machine
self.current_state = self.graph["initial"]
self.machine = self._core()
logger.trace("FSM closed")
@overload
def reset_on_exception(
func: Callable[Concatenate[TFSM, P], Awaitable[ActionReturn]],
) -> Callable[Concatenate[TFSM, P], Awaitable[ActionReturn]]:
"""Automatically reset the FSM after an exception occurs"""
@overload
def reset_on_exception(
auto_start: bool = False,
) -> Callable[
[Callable[Concatenate[TFSM, P], Awaitable[ActionReturn]]], Callable[Concatenate[TFSM, P], Awaitable[ActionReturn]]
]:
"""Automatically reset the FSM after an exception; when auto_start is True, also start the FSM automatically"""
# Modeled on the implementation of dataclasses.dataclass
def reset_on_exception(func=None, /, *, auto_start=False): # pyright: ignore[reportInconsistentOverload]
def warp(func: Callable[Concatenate[TFSM, P], Awaitable[ActionReturn]]):
return __reset_clear_up(func, auto_start)
# Determine whether this was called as @reset_on_exception or @reset_on_exception(...)
if func is None:
# Called with parentheses
return warp
# Called without parentheses
return warp(func)
def __reset_clear_up(func: Callable[Concatenate[TFSM, P], Awaitable[ActionReturn]], auto_start: bool):
@wraps(func)
async def wrapper(fsm_self: TFSM, *args: P.args, **kwargs: P.kwargs) -> ActionReturn:
try:
return await func(fsm_self, *args, **kwargs)
except Exception as e:
logger.error(f"Exception in {func.__name__}: {e}")
await fsm_self.reset()
if auto_start and not fsm_self.started:
await fsm_self.start()
raise e
return wrapper
+4 -13
@@ -14,7 +14,7 @@ from httpx import URL as HttpxURL
from nonebot_bison.types import Target
from .models import DynRawPost
from .fsm import FSM, Condition, StateGraph, Transition, ActionReturn, reset_on_exception
from .fsm import FSM, Condition, StateGraph, Transition, ActionReturn
if TYPE_CHECKING:
from .platforms import Bilibili
@@ -218,11 +218,6 @@ class RetryFSM(FSM[RetryState, RetryEvent, RetryAddon[TBilibili]]):
self.addon.reset_all()
await super().reset()
@override
@reset_on_exception
async def emit(self, event: RetryEvent):
await super().emit(event)
# FIXME: pulling this out makes testing easier, but a global singleton makes all decorated functions share state; needs improvement
_retry_fsm = RetryFSM(RETRY_GRAPH, RetryAddon["Bilibili"]())
@@ -241,19 +236,15 @@ def retry_for_352(api_func: Callable[[TBilibili, Target], Awaitable[list[DynRawP
case RetryState.NROMAL | RetryState.REFRESH | RetryState.RAISE:
try:
res = await api_func(bls, *args, **kwargs)
except ApiCode352Error as e:
logger.warning("本次 Bilibili API 请求返回 352 错误")
except ApiCode352Error:
logger.error("API 352 错误")
await _retry_fsm.emit(RetryEvent.REQUEST_AND_RAISE)
if _retry_fsm.current_state == RetryState.RAISE:
raise e
return []
else:
await _retry_fsm.emit(RetryEvent.REQUEST_AND_SUCCESS)
return res
case RetryState.BACKOFF:
logger.warning("本次 Bilibili 请求回避中,不请求")
logger.warning("回避中,不请求")
await _retry_fsm.emit(RetryEvent.IN_BACKOFF_TIME)
return []
case _:
+1 -1
@@ -68,7 +68,7 @@ class BilibiliClientManager(ClientManager):
class BilibiliSite(Site):
name = "bilibili.com"
schedule_setting = {"seconds": 60}
schedule_setting = {"seconds": 50}
schedule_type = "interval"
client_mgr = BilibiliClientManager
require_browser = True
+3 -3
@@ -9,7 +9,7 @@ from bs4 import BeautifulSoup as bs
from ..post import Post
from .platform import NewMessage
from ..types import Target, RawPost
from ..utils import Site, text_similarity
from ..utils import Site, text_fletten, text_similarity
class RssSite(Site):
@@ -32,7 +32,7 @@ class RssPost(Post):
for p in soup.find_all("p"):
p.insert_after("\n")
return soup.get_text()
return text_fletten(soup.get_text())
class Rss(NewMessage):
@@ -82,7 +82,7 @@ class Rss(NewMessage):
async def parse(self, raw_post: RawPost) -> Post:
title = raw_post.get("title", "")
soup = bs(raw_post.description, "html.parser")
desc = raw_post.description
desc = soup.text.strip()
title, desc = self._text_process(title, desc)
pics = [x.attrs["src"] for x in soup("img")]
if raw_post.get("media_content"):
+1 -3
@@ -9,9 +9,9 @@ from nonebot_plugin_saa import Text, PlatformTarget, SupportedAdapters
from ..types import Target
from ..config import config
from ..apis import check_sub_target
from ..platform import Platform, platform_manager
from ..config.db_config import SubscribeDupException
from .utils import common_platform, ensure_user_info, gen_handle_cancel
from ..platform import Platform, platform_manager, unavailable_paltforms
def do_add_sub(add_sub: type[Matcher]):
@@ -39,8 +39,6 @@ def do_add_sub(add_sub: type[Matcher]):
elif platform == "取消":
await add_sub.finish("已中止订阅")
elif platform in platform_manager:
if platform in unavailable_paltforms:
await add_sub.finish(f"无法订阅 {platform}{unavailable_paltforms[platform]}")
state["platform"] = platform
else:
await add_sub.reject("平台输入错误")
+4 -4
@@ -11,9 +11,9 @@
"docs:update-package": "pnpm dlx vp-update"
},
"devDependencies": {
"@vuepress/bundler-vite": "2.0.0-rc.15",
"vue": "^3.5.6",
"vuepress": "2.0.0-rc.15",
"vuepress-theme-hope": "2.0.0-rc.52"
"@vuepress/bundler-vite": "2.0.0-rc.14",
"vue": "^3.4.31",
"vuepress": "2.0.0-rc.14",
"vuepress-theme-hope": "2.0.0-rc.50"
}
}
+1771 -2121
File diff suppressed because it is too large
Generated
+519 -536
File diff suppressed because it is too large
+24 -24
@@ -1,6 +1,6 @@
[tool.poetry]
name = "nonebot-bison"
version = "0.9.4"
version = "0.9.3"
description = "Subscribe message from social medias"
authors = ["felinae98 <felinae225@qq.com>"]
license = "MIT"
@@ -24,40 +24,40 @@ classifiers = [
python = ">=3.10,<4.0.0"
beautifulsoup4 = ">=4.12.3"
feedparser = "^6.0.11"
httpx = ">=0.27.2"
nonebot2 = { extras = ["fastapi"], version = "^2.3.3" }
nonebot-adapter-onebot = "^2.4.5"
nonebot-plugin-htmlrender = ">=0.3.5"
httpx = ">=0.27.0"
nonebot2 = { extras = ["fastapi"], version = "^2.3.2" }
nonebot-adapter-onebot = "^2.4.4"
nonebot-plugin-htmlrender = ">=0.3.3"
nonebot-plugin-datastore = ">=1.3.0,<2.0.0"
nonebot-plugin-apscheduler = ">=0.5.0"
nonebot-plugin-send-anything-anywhere = ">=0.7.1,<0.7.2"
pillow = ">=10.4.0,<11.0"
pyjwt = "^2.9.0"
python-socketio = "^5.11.4"
nonebot-plugin-send-anything-anywhere = ">=0.6.1,<0.7.0"
pillow = ">=8.4.0,<11.0"
pyjwt = "^2.8.0"
python-socketio = "^5.11.3"
tinydb = "^4.8.0"
qrcode = "^7.4.2"
pydantic = ">=2.9.2,<3.0.0,!=2.5.0,!=2.5.1"
lxml = ">=5.3.0"
yarl = ">=1.11.1"
hishel = "^0.0.30"
expiringdictx = "^1.1.0"
rapidfuzz = "^3.9.7"
pydantic = ">=1.10.17,<3.0.0,!=2.5.0,!=2.5.1"
lxml = ">=5.2.2"
yarl = ">=1.9.4"
hishel = "^0.0.20"
expiringdictx = "^1.0.1"
rapidfuzz = "^3.9.3"
[tool.poetry.group.dev.dependencies]
black = ">=24.8.0,<25.0"
black = ">=23.12.1,<25.0"
ipdb = "^0.13.13"
isort = "^5.13.2"
nonemoji = "^0.1.4"
nb-cli = "^1.4.2"
pre-commit = "^3.8.0"
ruff = "^0.6.5"
nb-cli = "^1.4.1"
pre-commit = "^3.7.1"
ruff = "^0.6.0"
[tool.poetry.group.test.dependencies]
flaky = "^3.8.1"
nonebug = "^0.3.7"
nonebug-saa = "^0.4.1"
pytest = ">=8.3.3,<9.0.0"
pytest-asyncio = ">=0.24.0,<0.24.1"
pytest = ">=7.4.4,<9.0.0"
pytest-asyncio = ">=0.23.7,<0.24.0"
pytest-cov = ">=5.0.0,<6"
pytest-mock = "^3.14.0"
pytest-xdist = { extras = ["psutil"], version = "^3.6.1" }
@@ -68,10 +68,10 @@ freezegun = "^1.5.1"
optional = true
[tool.poetry.group.docker.dependencies]
nb-cli = "^1.4.2"
nonebot2 = { extras = ["fastapi", "aiohttp"], version = "^2.3.3" }
nb-cli = "^1.4.1"
nonebot2 = { extras = ["fastapi", "aiohttp"], version = "^2.3.2" }
nonebot-adapter-red = "^0.9.0"
nonebot-adapter-qq = "^1.5.1"
nonebot-adapter-qq = "^1.4.4"
poetry-core = "^1.9.0"
[tool.poetry.extras]
-10
@@ -18,7 +18,6 @@ def pytest_configure(config: pytest.Config) -> None:
"superusers": {"10001"},
"command_start": {""},
"log_level": "TRACE",
"bison_use_browser": True,
}
@@ -114,12 +113,3 @@ async def use_legacy_config(app: App):
# 清除单例的缓存
Singleton._instances.clear()
@pytest.fixture
async def _no_browser(app: App, mocker: MockerFixture):
from nonebot_bison.plugin_config import plugin_config
from nonebot_bison.platform import _get_unavailable_platforms
mocker.patch.object(plugin_config, "bison_use_browser", False)
mocker.patch("nonebot_bison.platform.unavailable_paltforms", _get_unavailable_platforms())
+1 -94
@@ -58,92 +58,6 @@ def without_dynamic(app: App):
)
@pytest.mark.asyncio
async def test_reset_on_exception(app: App):
from strenum import StrEnum
from nonebot_bison.platform.bilibili.fsm import FSM, StateGraph, Transition, ActionReturn, reset_on_exception
class State(StrEnum):
A = "A"
B = "B"
C = "C"
class Event(StrEnum):
A = "A"
B = "B"
C = "C"
class Addon:
pass
async def raction(from_: State, event: Event, to: State, addon: Addon) -> ActionReturn:
logger.info(f"action: {from_} -> {to}")
raise RuntimeError("test")
async def action(from_: State, event: Event, to: State, addon: Addon) -> ActionReturn:
logger.info(f"action: {from_} -> {to}")
graph: StateGraph[State, Event, Addon] = {
"transitions": {
State.A: {
Event.A: Transition(raction, State.B),
Event.B: Transition(action, State.C),
},
State.B: {
Event.B: Transition(action, State.C),
},
State.C: {
Event.C: Transition(action, State.A),
},
},
"initial": State.A,
}
addon = Addon()
class AFSM(FSM[State, Event, Addon]):
@reset_on_exception(auto_start=True)
async def emit(self, event: Event):
return await super().emit(event)
fsm = AFSM(graph, addon)
await fsm.start()
with pytest.raises(RuntimeError):
await fsm.emit(Event.A)
assert fsm.started is True
await fsm.emit(Event.B)
await fsm.emit(Event.C)
class BFSM(FSM[State, Event, Addon]):
@reset_on_exception
async def emit(self, event: Event):
return await super().emit(event)
fsm = BFSM(graph, addon)
await fsm.start()
with pytest.raises(RuntimeError):
await fsm.emit(Event.A)
assert fsm.started is False
with pytest.raises(TypeError, match="can't send non-None value to a just-started async generator"):
await fsm.emit(Event.B)
class CFSM(FSM[State, Event, Addon]): ...
fsm = CFSM(graph, addon)
await fsm.start()
with pytest.raises(RuntimeError):
await fsm.emit(Event.A)
assert fsm.started is True
with pytest.raises(StopAsyncIteration):
await fsm.emit(Event.B)
@pytest.mark.asyncio
async def test_retry_for_352(app: App, mocker: MockerFixture):
from nonebot_bison.post import Post
@@ -269,7 +183,7 @@ async def test_retry_for_352(app: App, mocker: MockerFixture):
fakebili.set_raise352(True)
for state in test_state_list[:-3]:
for state in test_state_list:
logger.info(f"\n\nnow state should be {state}")
assert _retry_fsm.current_state == state
@@ -280,13 +194,6 @@ async def test_retry_for_352(app: App, mocker: MockerFixture):
if state == RetryState.BACKOFF:
freeze_start += timedelta_length * (_retry_fsm.addon.backoff_count + 1) ** 2
for state in test_state_list[-3:]:
logger.info(f"\n\nnow state should be {state}")
assert _retry_fsm.current_state == state
with pytest.raises(ApiCode352Error):
await fakebili.get_sub_list(Target("t1")) # type: ignore
assert client_mgr.refresh_client_call_count == 4 * 3 + 3 # refresh + raise
assert client_mgr.get_client_call_count == 2 + 4 * 3 + 3 # previous + refresh + raise
+4 -18
@@ -88,21 +88,9 @@ async def test_fetch_new_1(
assert post1.title is None
assert (
post1.content
== "【#統合戦略】 <br />引き続き新テーマ「ミヅキと紺碧の樹」の新要素及びシステムの変更点を一部ご紹介します! "
"<br /><br />"
"今回は「灯火」、「ダイス」、「記号認識」、「鍵」についてです。<br />詳細は添付の画像をご確認ください。"
"<br /><br />"
"#アークナイツ https://t.co/ARmptV0Zvu<br />"
'<img src="https://pbs.twimg.com/media/FwZG9YAacAIXDw2?format=jpg&amp;name=orig" />'
)
plain_content = await post1.get_plain_content()
assert (
plain_content == "【#統合戦略】 \n"
"引き続き新テーマ「ミヅキと紺碧の樹」の新要素及びシステムの変更点を一部ご紹介します! \n\n"
"今回は「灯火」、「ダイス」、「記号認識」、「鍵」についてです。\n"
"詳細は添付の画像をご確認ください。\n\n"
"#アークナイツ https://t.co/ARmptV0Zvu\n"
"[图片]"
== "【#統合戦略】 引き続き新テーマ「ミヅキと紺碧の樹」の新要素及びシステムの変更点を一部ご紹介します!"
" 今回は「灯火」、「ダイス」、「記号認識」、「鍵」についてです。詳細は添付の画像をご確認ください。"
"#アークナイツ https://t.co/ARmptV0Zvu"
)
@@ -186,9 +174,7 @@ async def test_fetch_new_4(
assert len(res2[0][1]) == 1
post1 = res2[0][1][0]
assert post1.url == "https://wallhaven.cc/w/85rjej"
assert post1.content == '<img alt="loading" class="lazyload" src="https://th.wallhaven.cc/small/85/85rjej.jpg" />'
plain_content = await post1.get_plain_content()
assert plain_content == "[图片]"
assert post1.content == "85rjej.jpg"
def test_similar_text_process():
-45
@@ -615,48 +615,3 @@ async def test_add_with_bilibili_bangumi_target_parser(app: App, init_scheduler)
assert sub.tags == []
assert sub.target.platform_name == "bilibili-bangumi"
assert sub.target.target_name == "汉化日记 第三季"
@pytest.mark.asyncio
async def test_subscribe_platform_requires_browser(app: App, mocker: MockerFixture):
from nonebot.adapters.onebot.v11.event import Sender
from nonebot.adapters.onebot.v11.message import Message
from nonebot_bison.plugin_config import plugin_config
from nonebot_bison.sub_manager import add_sub_matcher, common_platform
from nonebot_bison.platform import platform_manager, unavailable_paltforms
mocker.patch.object(plugin_config, "bison_use_browser", False)
mocker.patch.dict(unavailable_paltforms, {"bilibili": "需要启用 bison_use_browser"})
async with app.test_matcher(add_sub_matcher) as ctx:
bot = ctx.create_bot()
event_1 = fake_group_message_event(
message=Message("添加订阅"),
sender=Sender(card="", nickname="test", role="admin"),
to_me=True,
)
ctx.receive_event(bot, event_1)
ctx.should_pass_rule()
ctx.should_call_send(
event_1,
BotReply.add_reply_on_platform(platform_manager=platform_manager, common_platform=common_platform),
True,
)
event_2 = fake_group_message_event(
message=Message("全部"), sender=Sender(card="", nickname="test", role="admin")
)
ctx.receive_event(bot, event_2)
ctx.should_rejected()
ctx.should_call_send(
event_2,
BotReply.add_reply_on_platform_input_allplatform(platform_manager),
True,
)
event_3 = fake_group_message_event(message=Message("bilibili"), sender=fake_admin_user)
ctx.receive_event(bot, event_3)
ctx.should_call_send(
event_3,
BotReply.add_reply_platform_unavailable("bilibili", "需要启用 bison_use_browser"),
True,
)
-4
@@ -146,10 +146,6 @@ class BotReply:
extra_text = ("1." + target_promot + "\n2.") if target_promot else ""
return extra_text + base_text
@staticmethod
def add_reply_platform_unavailable(platform: str, reason: str) -> str:
return f"无法订阅 {platform}{reason}"
add_reply_on_id_input_error = "id输入错误"
add_reply_on_target_parse_input_error = "不能从你的输入中提取出id,请检查你输入的内容是否符合预期"
add_reply_on_platform_input_error = "平台输入错误"
Binary file not shown.
-230
@@ -1,230 +0,0 @@
<br><br><br>
<center><p style="font-size:56px"><b>Final Project Report</b></p></center>
<br>
<center>Project name: <u>Cookie Management and Scheduling System for the Bison Crawler</u></center>
<center>Lead mentor: <u><a href="mailto:felinae225@qq.com">felinae98</a></u></center>
<center>Reported by: <u>杜家楷</u></center>
<center>Date: <u>2024.09.30</u></center>
<center>Email: <u>suyiiyii@gmail.com</u></center>
[toc]
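# Project Background
Bison is a NoneBot plugin that collects information from various sites and social platforms and pushes it to QQ. As fetching information from these sites becomes harder, Bison needs to support sending requests with cookies.
This project adds cookie support to Bison. It delivers a general-purpose cookie component that supports information collection on each platform while remaining extensible, and that schedules both real-name and anonymous cookies. It also provides a complete UI for administrators to manage cookies.
# Design Overview
A CookieClientManager is created by inheriting from the existing ClientManager. When a client is requested, it automatically selects a suitable cookie based on the Target.
## Cookie Storage
Because cookies and subscribed Targets have a many-to-many relationship, two tables are created: a Cookie table that stores each cookie's content, status, and related information, and a CookieTarget table that stores the relationships between cookies and targets.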
```mermaid
classDiagram
class Cookie {
int id
str site_name
str content
str cookie_name
datetime last_usage
str status
int cd_milliseconds
bool is_universal
bool is_anonymous
dict[str, Any] tags
}
class CookieTarget {
int id
int target_id
int cookie_id
}
```
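As a rough illustration of the two-table layout, the sketch below builds a simplified version of it in an in-memory SQLite database and looks up the cookies linked to a target. Only the table and column names come from the diagram; the column types, the reduced column set, and the helper names `cookies_for_target` and `build_demo_db` are assumptions made for this example, not Bison's actual code.

```python
import sqlite3

# Hypothetical, simplified schema: column names follow the class diagram,
# while the types and constraints are assumptions for illustration.
SCHEMA = """
CREATE TABLE cookie (
    id INTEGER PRIMARY KEY,
    site_name TEXT NOT NULL,
    content TEXT NOT NULL,
    is_universal INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE cookie_target (
    id INTEGER PRIMARY KEY,
    target_id INTEGER NOT NULL,
    cookie_id INTEGER NOT NULL REFERENCES cookie(id)
);
"""


def cookies_for_target(conn: sqlite3.Connection, target_id: int) -> list[int]:
    """Return the ids of all cookies linked to one target, in ascending order."""
    rows = conn.execute(
        "SELECT c.id FROM cookie AS c "
        "JOIN cookie_target AS ct ON ct.cookie_id = c.id "
        "WHERE ct.target_id = ? ORDER BY c.id",
        (target_id,),
    ).fetchall()
    return [row[0] for row in rows]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two cookies and three links."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO cookie (id, site_name, content) VALUES (?, ?, ?)",
        [(1, "weibo.com", "cookie-a"), (2, "weibo.com", "cookie-b")],
    )
    # Cookie 1 serves targets 10 and 11; target 10 is served by cookies 1 and 2.
    conn.executemany(
        "INSERT INTO cookie_target (target_id, cookie_id) VALUES (?, ?)",
        [(10, 1), (11, 1), (10, 2)],
    )
    return conn
```

Keeping the links in their own table means a cookie can be attached to or detached from a target without touching the cookie row itself.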
## Getting a Cookie
```python
async def get_client(self, target: Target | None) -> AsyncClient: ...
async def get_client_for_static(self) -> AsyncClient: ...
async def get_query_name_client(self) -> AsyncClient: ...
async def refresh_client(self): ...
```
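The existing ClientManager exposes the methods above. When a Platform module fetches data, it calls get_client to obtain an AsyncClient and then sends requests with it. So overriding get_client to return a client that carries a cookie chosen from the given Target is enough to make requests carry cookies.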
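A minimal sketch of that override, under stated assumptions: `StubClient` stands in for `httpx.AsyncClient` so the example has no third-party imports, the target is a plain string, and the in-memory `cookies_by_target` dictionary is hypothetical (the report stores cookies in a database).

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StubClient:
    """Stand-in for httpx.AsyncClient so the sketch has no third-party imports."""

    cookies: dict[str, str] = field(default_factory=dict)


class ClientManager:
    """Reduced to the one method that matters here."""

    async def get_client(self, target: Optional[str]) -> StubClient:
        return StubClient()


class CookieClientManager(ClientManager):
    def __init__(self, cookies_by_target: dict[str, dict[str, str]]):
        # Hypothetical in-memory lookup; the report keeps cookies in a database.
        self._cookies_by_target = cookies_by_target

    async def get_client(self, target: Optional[str]) -> StubClient:
        client = await super().get_client(target)
        if target is not None:
            # Attach the cookie chosen for this target, if there is one.
            client.cookies.update(self._cookies_by_target.get(target, {}))
        return client
```

Because callers only ever see `get_client`, platform code that already fetches through the manager would not need to change.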
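## Scheduling Cookies
Cookies are divided into real-name cookies and anonymous cookies. A real-name cookie is uploaded by a user; an anonymous cookie is one the program can generate automatically.
The project also contains Platforms that have no concept of a target, and these need to be supported as well.
To schedule cookies, fields such as `status`, `last_usage`, `cd` (stored as `cd_milliseconds`), and `is_anonymous` were added. Their definitions and meanings are as follows: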
```python
# Time of last use
last_usage: Mapped[datetime.datetime] = mapped_column(DateTime, default=datetime.datetime(1970, 1, 1))
# Current status of the cookie
status: Mapped[str] = mapped_column(String(20), default="")
# Cooldown required after each use
cd_milliseconds: Mapped[int] = mapped_column(default=0)
# Whether this is a universal cookie (valid for all Targets)
is_universal: Mapped[bool] = mapped_column(default=False)
# Whether this is an anonymous cookie
is_anonymous: Mapped[bool] = mapped_column(default=False)
# Tags, reserved for extension
tags: Mapped[dict[str, Any]] = mapped_column(JSON().with_variant(JSONB, "postgresql"), default={})
```
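Where:
- **is_universal**: marks whether a cookie is universal, i.e. valid for all Targets. It can be thought of as a special kind of target, and is set via a parameter when adding or fetching a cookie.
- **is_anonymous**: marks whether a cookie is anonymous. The current definition is a cookie that the program can generate automatically and that applies to all targets. The current logic is that when Bison starts, it deletes the existing anonymous cookie and generates a new one.
- **Handling cookies for platforms without Targets**
For platforms without Targets, such as 小刻食堂 (Ceobe Canteen), init_cookie can be overridden to set the is_universal attribute on user cookies. Because the Target passed in when getting a Client is empty, only is_universal cookies are selected, which gives user-cookie scheduling for platforms without Targets.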
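The matching rule for no-target platforms can be sketched as follows. The `CookieRecord` type and both function names are hypothetical, and the real init_cookie signature is not shown in this report, so this is only an illustration of the idea that universal cookies match even when no target is given.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CookieRecord:
    """Hypothetical in-memory view of a cookie and the targets linked to it."""

    content: str
    is_universal: bool = False
    targets: set[str] = field(default_factory=set)


def init_cookie_for_no_target_platform(cookie: CookieRecord) -> CookieRecord:
    """Stand-in for an init_cookie override: mark the user cookie as universal."""
    cookie.is_universal = True
    return cookie


def matching_cookies(
    cookies: list[CookieRecord], target: Optional[str]
) -> list[CookieRecord]:
    """Universal cookies always match; others match only their linked targets."""
    return [
        c
        for c in cookies
        if c.is_universal or (target is not None and target in c.targets)
    ]
```

With `target=None`, a cookie linked only to specific targets is never returned, which is what restricts no-target platforms to universal cookies.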
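## Selecting a Cookie
### A Priority-Queue-Based Cookie Selection Algorithm
This is only a simple selection scheme.
Assumptions:
- A cookie's "idle time" is the time elapsed since it was last selected
- Each cookie has its own cooldown (CD) and can only be reused after that interval has passed since its last use
- Anonymous cookies act as a fallback and are given a shorter CD than real-name cookies
Implementation:
- Each time a cookie is selected, record the current time
- On each selection, pick the cookie with the longest idle time and check whether its CD has elapsed; if it is still cooling down, move on to the next cookie, otherwise select it
- If no cookie is available, skip this round of selection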
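The steps above can be sketched with a heap ordered by last-use time. This is an illustrative version under assumptions, not the project's implementation: `CookieState` and `pick_cookie` are hypothetical names, and the cooldown check uses the strict form `last_usage + cd < now`.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class CookieState:
    """Hypothetical scheduling state for one cookie."""

    name: str
    last_usage: datetime
    cd: timedelta


def pick_cookie(cookies: list[CookieState], now: datetime) -> Optional[CookieState]:
    """Pick the longest-idle cookie whose cooldown has elapsed, or None."""
    # The earliest last_usage means the longest idle time; the index breaks ties.
    heap = [(c.last_usage, i) for i, c in enumerate(cookies)]
    heapq.heapify(heap)
    while heap:
        _, index = heapq.heappop(heap)
        candidate = cookies[index]
        if candidate.last_usage + candidate.cd < now:
            candidate.last_usage = now  # record the moment of selection
            return candidate
        # Still cooling down: fall through to the next longest-idle cookie.
    return None  # nothing usable, so the caller skips this round
```

Giving the anonymous cookie a shorter `cd` is what lets it act as the fallback: it becomes eligible again sooner than the real-name cookies do.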
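# Timeline
## Research and Familiarization (July 1 - August 1)
- [x] Survey how mainstream platforms use cookies
- [x] Read the code in detail, trace and debug it, and get familiar with the project
- [x] Draft a development plan and submit it to the community for discussion
## Development (August 1 - September 1)
- [x] Discuss with the community and settle on the development plan
- [x] Write the core modules
- [x] Add unit tests for the related components
## Wrap-Up (September 1 - September 30)
- [x] Review and accept the core modules together with the community
- [x] Improve the related documentation
- [x] Consider what can be improved or added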
# Results
## Adding a Cookie via Chat
<img src="assets/image-20240928161112657.png" alt="image-20240928161112657" style="zoom:50%;" />
## Associating a Cookie with a Target via Chat
<img src="assets/image-20240928161504145.png" alt="image-20240928161504145" style="zoom:50%;" />
After this, Bison sends requests with the user-uploaded cookie and can fetch restricted content, such as posts visible only to followers.
<img src="assets/image-20240928171940890.png" alt="image-20240928171940890" style="zoom:50%;" />
## Deleting a Cookie
<img src="assets/image-20240928172609762.png" alt="image-20240928172609762" style="zoom:50%;" />
## Web UI
Cookie management was added on top of Bison's existing Web UI.
<img src="assets/image-20240929160056550.png" alt="image-20240929160056550" style="zoom:50%;" />
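# Project Challenges
I think the main difficulty of the project lies in cookie scheduling.
## Scheduling Basics
The project supports multiple users at once, and each user can subscribe to different platforms. Subscriptions are not managed by isolating tenants from one another; instead, all users share one scheduler. For example, if user A and user B both subscribe to the same Target, that Target is not fetched twice; it is fetched once and the result is sent to both users. So once cookies are introduced, if only one user has uploaded a cookie, should the information fetched with that cookie be forwarded to the other user?
After discussion with the community, the answer is yes. On that basis, my solution keeps the scheduler global and retains the original scheduling logic, but before the actual request, when the AsyncClient is obtained, a suitable cookie is selected according to the requested Target.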
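To make the shared-scheduler behaviour concrete, here is a small sketch in which each target is fetched once and the result is fanned out to every subscriber. The function names and the `(user, target)` tuple representation are assumptions for illustration; Bison's real scheduler is more involved.

```python
from collections import defaultdict
from typing import Callable


def group_subscribers(subscriptions: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each target to the users subscribed to it, given (user, target) pairs."""
    by_target: dict[str, list[str]] = defaultdict(list)
    for user, target in subscriptions:
        by_target[target].append(user)
    return dict(by_target)


def dispatch(
    subscriptions: list[tuple[str, str]],
    fetch: Callable[[str], list[str]],
) -> list[tuple[str, str, list[str]]]:
    """Fetch each target once and deliver the result to all of its subscribers."""
    deliveries = []
    for target, users in group_subscribers(subscriptions).items():
        # One request per target, regardless of which user's cookie is used.
        posts = fetch(target)
        for user in users:
            deliveries.append((user, target, posts))
    return deliveries
```

In this model the cookie choice happens inside `fetch`, so the fan-out logic stays the same whether or not any user has uploaded a cookie.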
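In addition, one real-name cookie can access multiple restricted resources (for example, one account can follow several Weibo users), and one restricted resource can be accessed with multiple real-name cookies. Cookies therefore need to be modeled carefully.
I abstracted two objects, Cookie and CookieTarget: the former stores the cookie data and the latter stores the relationship between a cookie and a Target, which handles the many-to-many relationship between them.
## Real-Name and Anonymous Cookies
A real-name cookie is one uploaded by a user; an anonymous cookie is one the program can generate by itself. Bison originally had only anonymous cookies; when one expired, it tried to generate a new one at essentially no extra cost. With real-name cookies, however, frequent requests may put the associated account at risk, so a reasonable request policy is needed.
I manage real-name and anonymous cookies together and process them with the same scheduling logic, but anonymous cookies use different parameters from real-name ones. This lets both be scheduled within a single framework.
## Platforms Without a Target Concept
Bison also supports collecting from sites such as announcement pages and blogs that have no concept of a "target" (sites that do have one include Weibo and Bilibili). For such platforms, a cookie does not need to be associated with a Target after it is added; equivalently, the platform can be seen as having a single Target.
For this I added the `is_universal` attribute to Cookie, indicating whether the cookie applies to all Targets of a platform; such cookies are always within the selection range during scheduling. This also supports the implementation of anonymous cookies.
## Cookie Scheduling
To collect as quickly as possible within safe limits, I designed a default scheduling policy.
- Select all matching cookies (both those associated with the Target and those marked `is_universal`)
- From the matching set, among cookies satisfying `last_usage + cd < now()`, select the one with the smallest `last_usage`
This gives a simple way to select real-name cookies while keeping anonymous cookies as a fallback.
In addition, when writing the Cookie module, I abstracted each stage of cookie scheduling into its own function, such as selecting a cookie, assembling the Client, and writing back status, so that platforms can adapt them to their own needs.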
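One way to picture that staged design is a base class whose stages are separate overridable methods. Everything below is a hypothetical sketch: the method names `_choose_cookie`, `_assemble_client`, and `_write_back`, the `"success"`/`"failed"` status strings, and the stub types are assumptions, not the module's actual API.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Optional


@dataclass
class StoredCookie:
    content: str
    status: str = ""


@dataclass
class StubClient:
    """Stand-in for an HTTP client; only the headers matter in this sketch."""

    headers: dict[str, str] = field(default_factory=dict)


class StagedCookieManager:
    def __init__(self, cookies: list[StoredCookie]):
        self.cookies = list(cookies)

    async def _choose_cookie(self, target: Optional[str]) -> Optional[StoredCookie]:
        # Stage 1: selection (trivially the first cookie in this sketch).
        return self.cookies[0] if self.cookies else None

    async def _assemble_client(self, cookie: Optional[StoredCookie]) -> StubClient:
        # Stage 2: build a client that carries the chosen cookie.
        client = StubClient()
        if cookie is not None:
            client.headers["Cookie"] = cookie.content
        return client

    async def _write_back(self, cookie: Optional[StoredCookie], ok: bool) -> None:
        # Stage 3: record the outcome on the cookie.
        if cookie is not None:
            cookie.status = "success" if ok else "failed"

    async def request(
        self,
        target: Optional[str],
        do_request: Callable[[StubClient], Awaitable[bool]],
    ) -> bool:
        cookie = await self._choose_cookie(target)
        client = await self._assemble_client(cookie)
        ok = await do_request(client)
        await self._write_back(cookie, ok)
        return ok


class TokenHeaderManager(StagedCookieManager):
    """A platform that sends its credential as a bearer token overrides one stage."""

    async def _assemble_client(self, cookie: Optional[StoredCookie]) -> StubClient:
        client = StubClient()
        if cookie is not None:
            client.headers["Authorization"] = f"Bearer {cookie.content}"
        return client
```

A platform with unusual requirements then replaces only the stage that differs and inherits the rest of the flow.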
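# Project Summary
- Completed work:
- Implemented CookieClientManager
- Created the tables that store cookies, so that CookieClientManager can select cookies.
- Added chat interactions for administrators to manage cookies
- Added cookie management to the existing WebUI
- Added cookie support to the import/export feature
- Test cases:
Unit tests were written following the project's development conventions; coverage did not drop and stays at or above 85%.
- Follow-up work:
The project's features are now fully developed and a PR has been submitted. Because of time constraints on both my side and the community's, the PR review is still in progress, so I will keep working with community members to revise the PR and merge it once it meets the community's requirements.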
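# Reflections
I first learned about the 开源之夏 (Open Source Promotion Plan) program through a recommendation from a senior student at my school. I had known about open source for a long time and wanted to take part, but never found a good way in. I am very grateful to the organizers for giving me this opportunity to get in touch with open source and contribute to it.
The main programming language of our project is Python. Before this, I had only used Python to write scripts for my own use, or at most some simple REST API backends, all of them what people call "toys". Taking part in this project showed me how an open source project with a real product and real users operates. Releases, Issues, PRs and reviews, and everyone discussing things together are all things I could not experience in my own projects.
Taking part in this project also greatly improved my coding ability. Advanced features such as `decorators`, `metaclasses`, and `type parameters` were things I had either never encountered or had only used superficially. Then there is unit testing: I had always wanted to write unit tests, but gave up whenever I hit a problem I could not solve. The project has a large number of unit tests, so I could model my own on the existing ones, and I learned a lot in the process. The project's terse coding style also greatly improved my ability to read code with few comments.
Here I want to thank the community members, who were willing to answer all kinds of questions from a newcomer like me and gave suggestions on my code and reports. The occasional banter among group members also made me feel at home.
I especially want to thank my mentor (felinae98), who, like a parent, reviewed my code ~~promptly~~ thoroughly, guided me to see the problems in my designs for myself, and offered improvements. When the community and I proposed ideas that seemed perfectly reasonable to us, my mentor analyzed in detail the state of the project, the complexity each idea would introduce, and whether it was really necessary. Only later, after I understood the project more deeply, did I realize that what I had proposed was over-engineered, and I came to agree with my mentor's view even more.
I am very glad to have taken part in 开源之夏. In the future I will remain passionate about open source and keep contributing actively.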