Home

Archives

About

Friend

2024

12-03
[RL] TRPO and PPO
11-11
[RL] Implementing RL Framework Algorithms and DQN in PyTorch
11-08
[PyTorch] How the Autograd Mechanism and Optimizers Work
11-07
[RL] Implementing DQN, Double DQN, Rainbow, DDPG, TD3, SAC, TRPO, and PPO with stable-baselines3
11-05
[RL] Lecture 8: Deep Policy Gradient
11-05
[RL] Lecture 7: Deep Reinforcement Learning
11-05
[RL] Lecture 6: Value and Policy Approximation Methods
11-02
[RL] Lecture 3: Value Function Estimation
11-02
[RL] Lecture 5: Planning and Learning
11-02
[RL] Lecture 1: Reinforcement Learning, Exploration and Exploitation
11-02
[misc] Proofs Related to RL Lecture 2
11-02
[RL] Lecture 4: Model-Free Control Methods
11-02
[RL] Lecture 2: Markov Decision Processes

RIKKA421

Posts

75

Categories

4

Tags

25

Categories

Eassy
LearnLog
Note
WorkLog

Tags

Algorithms
Automata
Codeforces
Compilers
Complier
Conda
ControlSystem
Lutris
ML
PyTorch
RL
RLHF
Ubuntu
conda
docker
env
fcitx
hexo
misc
mujoco python
obsidian
poems
stable-baselines3
Blog Setup
Literary Works Completion Project

Recent Posts

[misc] 25-03: Things I Came Across
[Automata] Ch9 Petri Nets (PN)
[Automata] Ch9 Transition Systems (TS)
[Automata] Ch8 Turing Machines (TM)

2020-2025 RIKKA421

Powered by Hexo Theme.Reimu

142k | 08:58
