mirror of
https://github.com/suyiiyii/nonebot-bison.git
synced 2026-05-09 18:27:56 +08:00
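🐛 Fix some formatting errors in bilibili pushes (#263)
* 🎈 perf(platform/bilibili): add text preprocessing before the similarity calculation. The longer of the dynamic (post) text and the description text is truncated to the length of the shorter one (handled in two cases: truncating from the front and truncating from the back)
* 🐞 fix(bilibili): fix a bug that left extra spaces in video descriptions
* 🦄 refactor(bilibili): change the text similarity comparison function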
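The preprocessing described in the first bullet can be sketched roughly as follows. This is only an illustration: the helper name `truncate_to_shorter` and its return shape are hypothetical, and reading "from the front" and "from the back" as keeping the leading and the trailing characters respectively is an assumption, not something the commit message confirms.

```python
def truncate_to_shorter(dynamic_text: str, description: str):
    """Cut the longer text down to the length of the shorter one.

    Hypothetical helper, not taken from the repository. Returns two
    (dynamic, description) pairs: one keeping the leading characters
    and one keeping the trailing characters.
    """
    n = min(len(dynamic_text), len(description))
    # Keep the first n characters of each text.
    head_pair = (dynamic_text[:n], description[:n])
    # Keep the last n characters; len(s) - n avoids the s[-0:] pitfall when n == 0.
    tail_pair = (
        dynamic_text[len(dynamic_text) - n:],
        description[len(description) - n:],
    )
    return head_pair, tail_pair


head_pair, tail_pair = truncate_to_shorter("New video: intro text", "intro text")
print(head_pair)  # ('New video:', 'intro text')
print(tail_pair)  # ('intro text', 'intro text')
```

In this example only the tail-aligned pair lines up, which suggests why both cases may be worth comparing before a similarity score is computed.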
@@ -101,17 +101,6 @@ if plugin_config.bison_filter_log:
     )


-def jaccard_text_similarity(str1: str, str2: str) -> float:
-    """
-    Compute whether the (character-based)
-    [Jaccard similarity coefficient](https://zh.wikipedia.org/wiki/雅卡尔指数)
-    of two strings reaches the threshold
-    """
-    set1 = set(str1)
-    set2 = set(str2)
-    return len(set1 & set2) / len(set1 | set2)
-
-
 def text_similarity(str1, str2) -> float:
     matcher = difflib.SequenceMatcher(None, str1, str2)
     t = sum(temp.size for temp in matcher.get_matching_blocks())
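As a rough illustration of how the two functions in this hunk differ, the sketch below contrasts a character-set Jaccard coefficient with the matched-block count that `difflib.SequenceMatcher` produces. The names `jaccard_chars` and `matched_chars` are hypothetical, and the empty-input guard is an added assumption. The hunk ends before `text_similarity` returns, so how it normalizes `t` is not shown here and is not reproduced.

```python
import difflib


def jaccard_chars(a: str, b: str) -> float:
    """Character-set Jaccard coefficient (hypothetical helper name)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumption: treat two empty strings as identical instead of dividing by zero.
        return 1.0
    return len(set_a & set_b) / len(union)


def matched_chars(a: str, b: str) -> int:
    """Total size of the matching blocks found by SequenceMatcher."""
    matcher = difflib.SequenceMatcher(None, a, b)
    # The final block is a zero-size sentinel, so it does not affect the sum.
    return sum(block.size for block in matcher.get_matching_blocks())


print(jaccard_chars("abc", "cba"))  # 1.0: the character sets are identical
print(matched_chars("abc", "cba"))  # 1: only one character matches in order
```

A set-based measure ignores character order and repetition, whereas the matching-block sum is order-sensitive, which may be one reason to prefer it when comparing a post text with a video description.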