Gateway auto-registration remediation plan: upgrade Step 3 to heartbeat + automatic re-registration (3 consecutive failures trigger A1+A3)

`doc/设计文档/网关自动注册机制整改方案_v1.0.md` (new file, 336 lines)

# Gateway ↔ Vol.Pro Auto-Registration Remediation Plan v1.0

> **Version**: 1.0
> **Date**: 2026-06-03
> **Baseline**: `doc/设计文档/网关自动注册机制检查报告20260603.md`
> **Scope of changes**: `gateway/Program.cs` + `VolPro/gateway_nodesService.cs` + `VolPro/base_deviceService.cs`

---
## 1. Remediation steps

### Step 1: Fix the gateway's A1 BaseUrl (est. 10 min)

**File**: `gateway/src/IntegrationGateway.Host/Program.cs`

**Current code** (lines 100-101):

```csharp
BaseUrl = $"http://localhost:{app.Urls.FirstOrDefault()?.Split(':').LastOrDefault() ?? "5100"}"
```

**Change to**:

```csharp
// Prefer the Gateway:SelfUrl setting; when it is null or blank, fall back to the port taken from Urls
var port = app.Urls.FirstOrDefault()?.Split(':').LastOrDefault() ?? "5100";
var configuredUrl = gwCfg["SelfUrl"];
var selfUrl = string.IsNullOrWhiteSpace(configuredUrl) ? $"http://localhost:{port}" : configuredUrl;
```

Then change the `BaseUrl =` line to:

```csharp
BaseUrl = selfUrl
```

**Addition to appsettings.json**:

```json
"Gateway": {
  "SelfUrl": null,  // In production set the real address, e.g. http://192.168.1.100:5100; leave it null to fall back to localhost
  ...
}
```
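
The fallback above takes the port by splitting the listen URL on `:`, which would yield `5100/` for a URL with a trailing slash. A minimal sketch of a more defensive resolution follows. `ResolveSelfUrl` and its parameters are hypothetical names introduced for illustration and are not part of the gateway codebase; the sketch assumes the listen URL is an ordinary absolute URL (wildcard hosts such as `http://*:5100` may not parse and would fall back to the default port).

```csharp
using System;

// Hypothetical helper: resolves the URL the gateway advertises in the A1 request.
static string ResolveSelfUrl(string? configuredSelfUrl, string? firstListenUrl, int defaultPort = 5100)
{
    // 1. An explicit, non-blank Gateway:SelfUrl always wins.
    if (!string.IsNullOrWhiteSpace(configuredSelfUrl))
        return configuredSelfUrl.Trim().TrimEnd('/');

    // 2. Otherwise take the port from the first listen URL, if it parses as an absolute URI.
    var port = defaultPort;
    if (Uri.TryCreate(firstListenUrl, UriKind.Absolute, out var uri) && uri.Port > 0)
        port = uri.Port;

    // 3. Fall back to localhost, which is only reachable when Vol.Pro runs on the same machine.
    return $"http://localhost:{port}";
}

Console.WriteLine(ResolveSelfUrl(null, "http://localhost:5100/"));
Console.WriteLine(ResolveSelfUrl("  ", "http://0.0.0.0:8080"));
Console.WriteLine(ResolveSelfUrl("http://192.168.1.100:5100/", null));
```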

**Build check**: `dotnet build gateway/IntegrationGateway.slnx` → 0 errors.

---
### Step 2: Call A3 device sync right after A1 registration (est. 30 min)

**File**: `gateway/src/IntegrationGateway.Host/Program.cs`

**Append the A3 sync after a successful A1 registration.** Replace the current code (lines 97-105) with:

```csharp
try
{
    var registerReq = new GatewayRegisterRequest
    {
        NodeCode = nodeCode, Token = nodeToken,
        AdapterTypes = adapterTypes, BaseUrl = selfUrl
    };
    await clientFactory.RegisterAsync(registerReq);
    Console.WriteLine($"[Gateway] A1 registration complete: nodeCode={nodeCode}, adapters={adapterTypes}");

    // ── A3: sync the devices of every adapter to Vol.Pro ──
    var allDevices = new List<object>();
    foreach (var adapter in registry.All)
    {
        try
        {
            if (adapter is IHasFlatDevices flat)
            {
                var result = await flat.GetDevicesAsync(1, 1000);
                foreach (var item in result.Items)
                {
                    // Map to the shape the A3 endpoint expects
                    allDevices.Add(new
                    {
                        AdapterCode = item.AdapterCode,
                        SourceId = item.SourceId,
                        Name = item.Name,
                        Category = item.Category,
                        Group = item.Group,
                        IsParent = item.IsParent,
                        ParentSourceId = item.ParentSourceId,
                        IsOnline = item.IsOnline,
                        IpAddress = item.IpAddress,
                        Port = item.Port,
                        ExtraDataJson = item.Extra != null
                            ? System.Text.Json.JsonSerializer.Serialize(item.Extra)
                            : null
                    });
                }
            }
            else if (adapter is IHasOwnDeviceTree tree)
            {
                var nodes = await tree.GetObjectTreeAsync();
                FlattenTree(allDevices, nodes, adapter.AdapterCode, null);
            }
        }
        // A failing adapter is logged and skipped so that the remaining adapters still sync
        catch (Exception ex) { Console.Error.WriteLine($"[Gateway] A3 sync failed: {adapter.AdapterCode} {ex.Message}"); }
    }

    if (allDevices.Count > 0)
    {
        await clientFactory.SyncDevicesAsync(nodeCode, nodeToken, allDevices);
        Console.WriteLine($"[Gateway] A3 device sync complete: {allDevices.Count} devices");
    }
}
// This catch also covers a failure of the final SyncDevicesAsync call, not only A1
catch (Exception ex) { Console.Error.WriteLine($"[Gateway] A1 registration / A3 sync failed: {ex.Message}"); }
```
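
The call `flat.GetDevicesAsync(1, 1000)` appears to request a single page, so an adapter with more than 1000 devices might be synced only partially. The sketch below shows one way to drain every page. It assumes the two arguments are a 1-based page index and a page size, which the source does not confirm; `FetchAllAsync`, `fetchPage`, and the fake data source are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper: keeps requesting pages until a short (or empty) page signals the end.
static async Task<List<T>> FetchAllAsync<T>(Func<int, int, Task<IReadOnlyList<T>>> fetchPage, int pageSize = 1000)
{
    var all = new List<T>();
    for (var page = 1; ; page++)
    {
        var items = await fetchPage(page, pageSize);
        all.AddRange(items);
        if (items.Count < pageSize) break; // last page reached
    }
    return all;
}

// Stand-in data source with 2500 fake device ids, used only to exercise the helper.
var fakeDevices = Enumerable.Range(1, 2500).Select(i => $"dev-{i}").ToList();
Task<IReadOnlyList<string>> FakePage(int page, int size) =>
    Task.FromResult<IReadOnlyList<string>>(fakeDevices.Skip((page - 1) * size).Take(size).ToList());

var devices = await FetchAllAsync<string>(FakePage);
Console.WriteLine(devices.Count);
```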

**New helper function** (at the bottom of Program.cs, before `app.Run()`):

```csharp
/// <summary>Recursively flattens the MC4 object tree into a device list.</summary>
void FlattenTree(List<object> devices, List<DeviceTreeNode> nodes, string adapterCode, string? parentSourceId)
{
    foreach (var n in nodes)
    {
        devices.Add(new
        {
            AdapterCode = adapterCode,
            SourceId = n.SourceId,
            Name = n.Name ?? n.SourceId,
            Category = n.Tag ?? "IoT设备",
            Group = "IoT设备",
            IsParent = n.Type == 1,
            ParentSourceId = parentSourceId,
            IsOnline = true,
            IpAddress = (string?)null,
            Port = (int?)null,
            ExtraDataJson = n.Option != null
                ? System.Text.Json.JsonSerializer.Serialize(n.Option)
                : null
        });
        // Recurse into children, passing this node as their parent
        if (n.Children?.Count > 0)
            FlattenTree(devices, n.Children, adapterCode, n.SourceId);
    }
}
```

**Build check**: `dotnet build` → 0 errors.

---
### Step 3: Start the A2 background heartbeat loop (est. 15 min)

**File**: `gateway/src/IntegrationGateway.Host/Program.cs`

**Append after the A1/A3 registration and sync**:

```csharp
// ── A2: start the background heartbeat (every 15 seconds by default) ──
// Non-numeric or non-positive values fall back to 15 s; PeriodicTimer rejects a zero or negative period
var heartbeatInterval = int.TryParse(gwCfg["HeartbeatIntervalSec"], out var sec) && sec > 0 ? sec : 15;
_ = Task.Run(async () =>
{
    using var timer = new PeriodicTimer(TimeSpan.FromSeconds(heartbeatInterval));
    while (await timer.WaitForNextTickAsync())
    {
        try
        {
            await clientFactory.HeartbeatAsync(new GatewayHeartbeatRequest
            {
                NodeCode = nodeCode, Token = nodeToken
            });
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine($"[Gateway] A2 heartbeat failed: {ex.Message}");
            _auth?.Invalidate(); // consider re-registering when heartbeats keep failing
        }
    }
});
```
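
The loop above runs until the process exits. If a clean shutdown matters, the timer can be tied to a cancellation token; in `Program.cs` that token could plausibly be `app.Lifetime.ApplicationStopping`. The following is a sketch under that assumption, not the gateway's actual code: `RunHeartbeatLoopAsync` and `sendHeartbeat` are hypothetical names, with `sendHeartbeat` standing in for `clientFactory.HeartbeatAsync(...)`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical loop: returns the number of heartbeats sent before the token was cancelled.
static async Task<int> RunHeartbeatLoopAsync(Func<Task> sendHeartbeat, TimeSpan interval, CancellationToken stop)
{
    var sent = 0;
    using var timer = new PeriodicTimer(interval);
    try
    {
        // WaitForNextTickAsync throws OperationCanceledException once 'stop' is cancelled.
        while (await timer.WaitForNextTickAsync(stop))
        {
            try { await sendHeartbeat(); sent++; }
            catch (Exception ex) { Console.Error.WriteLine($"[Gateway] A2 heartbeat failed: {ex.Message}"); }
        }
    }
    catch (OperationCanceledException) { /* normal shutdown */ }
    return sent;
}

// Demo only: a timed token replaces the application-lifetime token.
using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(500));
var count = await RunHeartbeatLoopAsync(() => Task.CompletedTask, TimeSpan.FromMilliseconds(50), cts.Token);
Console.WriteLine($"heartbeats sent before shutdown: {count}");
```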

`appsettings.json` already contains `"HeartbeatIntervalSec": 15`, so no change is needed.

**Build check**: `dotnet build` → 0 errors.

---
### Step 4: Fix the query syntax in RegisterNodeAsync (est. 5 min)

**File**: `api_sqlsugar/Warehouse/Services/device_manager/Partial/gateway_nodesService.cs`

**Current code** (~line 55):

```csharp
var existing = _repository.DbContext.Queryable<gateway_nodes>()
    .First(x => x.NodeCode == nodeCode);
```

**Change to**:

```csharp
var existing = await _repository.FindAsIQueryable(x => x.NodeCode == nodeCode)
    .FirstOrDefaultAsync();
```

**Also change the heartbeat method** (~line 92), from:

```csharp
var entity = _repository.DbContext.Queryable<gateway_nodes>()
    .First(x => x.NodeCode == nodeCode && x.NodeToken == token);
```

to:

```csharp
var entity = await _repository.FindAsIQueryable(x => x.NodeCode == nodeCode && x.NodeToken == token)
    .FirstOrDefaultAsync();
```

**Build check**: `dotnet build api_sqlsugar/Warehouse` → 0 errors.

---
### Step 5: Mark the duplicate Upsert logic as obsolete (est. 10 min)

**File**: `api_sqlsugar/Warehouse/Services/device_manager/Partial/base_deviceService.cs`

**Add the `[Obsolete]` attribute to the `UpsertDeviceAsync` method**:

```csharp
/// <summary>
/// [Obsolete] Device sync logic has moved to gateway_nodesService.SyncDevicesAsync.
/// This method is kept only for backward compatibility; do not use it in new code.
/// </summary>
[Obsolete("Moved to gateway_nodesService.SyncDevicesAsync")]
public async Task UpsertDeviceAsync(SyncDeviceItem d, int gatewayNodeId, Dictionary<(string, string), int> existingIds)
```

**Also check whether the `Ibase_deviceService` interface exposes this method.** `Igateway_nodesService` and `Ibase_deviceService` are defined in two separate Partial files; once the method is confirmed to be dead code with no external callers, it can simply be commented out.

**Build check**: `dotnet build` → 0 errors (at most `[Obsolete]` warnings).

---
## 2. Summary of changed files

| Step | File | Change | Effect |
|:---:|------|:---:|------|
| 1 | `gateway/Program.cs` | Change how BaseUrl is resolved | Production deployments can advertise a real IP |
| 1 | `gateway/appsettings.json` | Add the SelfUrl field | Optional setting |
| 2 | `gateway/Program.cs` | Add A3 sync + FlattenTree | Devices are present right after the first registration |
| 3 | `gateway/Program.cs` | Add the heartbeat loop | Gateway stays online |
| 4 | `VolPro/gateway_nodesService.cs` | Replace Queryable with FindAsIQueryable | Consistent with the code conventions |
| 5 | `VolPro/base_deviceService.cs` | Add the [Obsolete] attribute | Flags the duplicate logic as deprecated |

---
## 3. Build order

```
Steps 1-3: gateway → dotnet build gateway/IntegrationGateway.slnx
Steps 4-5: VolPro  → dotnet build api_sqlsugar/Warehouse
```

---
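## 4. Verification checklist

| Scenario | Expected result |
|------|------|
| Gateway startup | A1 registration succeeds + A3 syncs N devices + A2 heartbeat starts ticking |
| `gateway_nodes` table | Record inserted or updated; BaseUrl holds the real IP (when SelfUrl is configured) |
| `base_device` table | NodeId is backfilled on the devices that belong to the gateway |
| Admin device list | Owl/MC4/KMS devices are visible |
| After 30 seconds | Gateway remains online (LastHeartbeat keeps updating) |
| Gateway restart | NodeCode unchanged → A1 upsert updates the record → A3 re-syncs |

---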
## 5. Addendum: A2 heartbeat + automatic re-registration (enhanced Step 3)

> **Date**: 2026-06-03
> **Problem**: When the gateway starts before Vol.Pro, a failed A1 registration is never retried, so the gateway stays invisible permanently.

### 5.1 Enhanced Step 3 code

Replace the simple heartbeat from the original Step 3 with "heartbeat + automatic re-registration after consecutive failures":

```csharp
// ── A2: heartbeat + automatic re-registration ──
var heartbeatInterval = int.TryParse(gwCfg["HeartbeatIntervalSec"], out var sec) && sec > 0 ? sec : 15;
var failCount = 0;
var maxFails = 3;

_ = Task.Run(async () =>
{
    using var timer = new PeriodicTimer(TimeSpan.FromSeconds(heartbeatInterval));
    while (await timer.WaitForNextTickAsync())
    {
        try
        {
            await clientFactory.HeartbeatAsync(new GatewayHeartbeatRequest
                { NodeCode = nodeCode, Token = nodeToken });
            failCount = 0;
        }
        catch (Exception ex)
        {
            failCount++;
            Console.Error.WriteLine($"[Gateway] A2 heartbeat failed ({failCount}/{maxFails}): {ex.Message}");
            if (failCount >= maxFails)
            {
                Console.WriteLine("[Gateway] Heartbeat keeps failing, attempting re-registration...");
                try
                {
                    await clientFactory.RegisterAsync(new GatewayRegisterRequest
                        { NodeCode = nodeCode, Token = nodeToken, AdapterTypes = adapterTypes, BaseUrl = selfUrl });
                    await SyncAllDevicesAsync();
                    failCount = 0;
                    Console.WriteLine("[Gateway] Re-registration succeeded");
                }
                // While re-registration fails, failCount stays >= maxFails, so every later failed tick retries
                catch (Exception regEx)
                {
                    Console.Error.WriteLine($"[Gateway] Re-registration failed: {regEx.Message}");
                }
            }
        }
    }
});
```

### 5.2 Re-registration timeline

```
Gateway starts → Vol.Pro offline → A1 fails (logged only) → A2 heartbeat loop starts (every 15 s)
  → 15s: heartbeat fails (1/3)
  → 30s: heartbeat fails (2/3)
  → 45s: heartbeat fails (3/3) → triggers A1+A3 re-registration → succeeds (if Vol.Pro is reachable by then)
  → 60s: heartbeat succeeds (failCount=0) → back to normal
```
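
The counter logic behind this timeline can be checked without a network. The sketch below is a synchronous model of one heartbeat tick; `Tick`, `VolProUp`, and the two delegates are hypothetical stand-ins for `clientFactory.HeartbeatAsync` and the A1+A3 re-registration. It assumes Vol.Pro becomes reachable at the 40 s mark and rejects heartbeats from a node that has not registered yet.

```csharp
using System;
using System.Collections.Generic;

const int maxFails = 3;
var failCount = 0;
var log = new List<string>();

// Models one timer tick: heartbeat first, re-registration once the failure threshold is reached.
void Tick(int second, Func<bool> heartbeat, Func<bool> reRegister)
{
    if (heartbeat())
    {
        failCount = 0;
        log.Add($"{second}s: heartbeat ok");
        return;
    }
    failCount++;
    var entry = $"{second}s: heartbeat failed ({failCount}/{maxFails})";
    if (failCount >= maxFails)
    {
        var ok = reRegister();
        entry += ok ? " -> re-registered" : " -> re-registration failed";
        if (ok) failCount = 0; // otherwise the counter stays at the threshold and the next tick retries
    }
    log.Add(entry);
}

// Simulated environment (assumption): Vol.Pro is reachable from second 40 onward.
var registered = false;
bool VolProUp(int s) => s >= 40;

for (var s = 15; s <= 60; s += 15)
{
    var now = s;
    Tick(now,
        heartbeat: () => VolProUp(now) && registered,
        reRegister: () => { registered = VolProUp(now); return registered; });
}
log.ForEach(Console.WriteLine);
```

If Vol.Pro were still down at 45 s, the same model would log a failed re-registration and try again on every following tick until it succeeds.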
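
### 5.3 Additional verification scenarios

| Scenario | Expected result |
|------|------|
| Gateway starts before Vol.Pro | A1+A3 re-registration succeeds automatically after about 45 seconds (3 × 15 s), provided Vol.Pro is reachable by then |
| Vol.Pro restarts | Gateway detects the heartbeat failures → comes back online automatically |
| Gateway running normally | Heartbeats keep succeeding, failCount=0 |

### 5.4 Updated estimate for Step 3

Originally 15 min → now 20 min (adds the `SyncAllDevicesAsync` helper and the re-registration branch).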