
Variational Discriminator Bottleneck [ICLR 2019 Peng et al.]

Posted on 2019-03-20

Introduction

Adversarial learning methods have been widely applied across many domains in recent years, but their training is notoriously unstable. When the discriminator becomes too accurate, the gradients it produces carry too little information to train the generator effectively, so balancing the performance of the discriminator and the generator is crucial. In this paper, the authors propose a simple yet general way to constrain the information flowing into the discriminator with an information bottleneck. By constraining the mutual information between the discriminator's internal representation and the raw input, the discriminator's accuracy can be controlled so that the gradients it produces carry more useful guidance for training the generator. The proposed variational discriminator bottleneck significantly improves imitation learning and inverse reinforcement learning algorithms, and, owing to its generality, any adversarial generative model can benefit from it.

Variational Information Bottleneck

We start from the variational information bottleneck in supervised learning. For a classification task, the optimization objective is:
$$
\min_q \mathbb{E}_{x,y\sim p(x,y)}\left[ -\log q(y|x) \right].
$$
However, directly optimizing this objective tends to overfit. Introducing an information bottleneck encourages the model to focus only on the most discriminative features of the input. We first introduce an encoder $E(z|x)$ that maps the input $x$ to a latent distribution, and then constrain an upper bound $I_c$ on the mutual information $I(X,Z)$ between the encoding and the original data, which yields the following objective:
$$
\begin{align}
J(q,E)=&\min_{q,E} \;\;\mathbb{E}_{x,y \sim p(x,y)} \left[ \mathbb{E}_{z \sim E(z|x)} \left[ -\log q(y|z) \right] \right] \nonumber \\
&\text{s.t.}\;\;\;\;I(X,Z) \leq I_c.
\end{align}
$$
We can introduce a variational upper bound on the mutual information, which gives an upper bound on the above objective, and then use the method of Lagrange multipliers to turn the constrained optimization problem into an unconstrained one. The detailed derivation is shown in the figure below:

(Figure: derivation of the unconstrained objective)
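The end result of that derivation is an unconstrained objective of the following form (a sketch reconstructed from the standard variational information bottleneck argument; the approximate marginal $r(z)$ and the multiplier $\beta$ are the usual ingredients of that derivation rather than being copied from the figure):
$$
\begin{align}
\tilde{J}(q,E) = \min_{q,E} \max_{\beta \geq 0} \;\; &\mathbb{E}_{x,y \sim p(x,y)} \left[ \mathbb{E}_{z \sim E(z|x)} \left[ -\log q(y|z) \right] \right] \nonumber \\
&+ \beta \left( \mathbb{E}_{x \sim p(x)} \left[ \text{KL} \left[ E(z|x) \,\|\, r(z) \right] \right] - I_c \right),
\end{align}
$$
where the KL term is the variational upper bound on $I(X,Z)$ obtained by replacing the intractable marginal $p(z)$ with an approximate marginal $r(z)$.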

Variational Discriminator Bottleneck

Next, we apply the variational information bottleneck above to the discriminator loss of a standard generative adversarial network:

(Figure: variational discriminator bottleneck objective)
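A sketch of how the bottleneck attaches to the standard GAN discriminator loss (here $p^*$ for the real data distribution, $G$ for the generator distribution, and $\tilde{p}$ for a mixture of the two are assumed notation, not taken from the figure):
$$
\begin{align}
J(D,E) = \min_{D,E} \max_{\beta \geq 0} \;\; &\mathbb{E}_{x \sim p^*(x)} \left[ \mathbb{E}_{z \sim E(z|x)} \left[ -\log D(z) \right] \right] + \mathbb{E}_{x \sim G(x)} \left[ \mathbb{E}_{z \sim E(z|x)} \left[ -\log \left( 1 - D(z) \right) \right] \right] \nonumber \\
&+ \beta \left( \mathbb{E}_{x \sim \tilde{p}(x)} \left[ \text{KL} \left[ E(z|x) \,\|\, r(z) \right] \right] - I_c \right),
\end{align}
$$
i.e., the discriminator now operates on the encoding $z$ rather than directly on $x$, and the bottleneck constraint is enforced over samples from both distributions.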

Since generative adversarial imitation learning and adversarial inverse reinforcement learning algorithms use this same adversarial framework, the variational discriminator bottleneck can be plugged into them as well to improve performance.

Discussion

Why does the variational discriminator bottleneck improve adversarial generative models? In adversarial learning, if the real data distribution and the generated data distribution have disjoint supports, an optimal discriminator can perfectly separate the two distributions and its gradient is almost everywhere zero. Hence, once the discriminator converges to optimality, the gradient used to train the generator vanishes. One existing remedy is to add continuous noise to the discriminator's inputs so that the two distributions have continuous support everywhere; in practice, however, when the two distributions are far apart, adding noise has almost no effect. With the variational discriminator bottleneck, the encoder first maps the inputs into an embedding space and the information bottleneck constraint is applied to the embeddings, so the two distributions not only share support but also overlap substantially (their distance is small). Since the information bottleneck plays a role partly equivalent to injecting noise, the vanishing-gradient problem is alleviated.


Accelerate your pandas workflows by changing one line of code

Posted on 2018-10-26

Curiosity-Driven Learning made easy Part I (Repost)

Posted on 2018-10-17

Curiosity-Driven Learning made easy Part I

This article is part of Deep Reinforcement Learning Course with Tensorflow 🕹️. Check the syllabus here.

(Image: OpenAI Five contest)

In recent years, we've seen a lot of innovations in Deep Reinforcement Learning. From DeepMind and the Deep Q-learning architecture in 2014 to OpenAI playing Dota 2 with OpenAI Five in 2018, we live in an exciting and promising moment.

And today we'll learn about Curiosity-Driven Learning, one of the most exciting and promising strategies in Deep Reinforcement Learning.

Reinforcement Learning is based on the reward hypothesis, the idea that every goal can be described as the maximization of expected reward. However, the current problem with extrinsic rewards (i.e., rewards given by the environment) is that the reward function is hand-coded by a human, which does not scale.

The idea of Curiosity-Driven Learning is to build a reward function that is intrinsic to the agent (generated by the agent itself). The agent thus becomes a self-learner, since it is both the student and its own source of feedback.
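A minimal sketch of one common way to generate such an intrinsic reward, as the prediction error of a learned forward model (the network shapes and names here are illustrative placeholders, not code from the course):

```python
import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    """Predicts the next state feature from the current state feature and action."""
    def __init__(self, feat_dim, action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, feat_dim),
        )

    def forward(self, state_feat, action):
        return self.net(torch.cat([state_feat, action], dim=-1))

def intrinsic_reward(forward_model, state_feat, action, next_state_feat, scale=0.01):
    """Curiosity bonus: large where the agent's own forward model predicts poorly."""
    with torch.no_grad():
        pred_next = forward_model(state_feat, action)
        # Per-transition mean squared prediction error, scaled into a small bonus.
        return scale * 0.5 * (pred_next - next_state_feat).pow(2).mean(dim=-1)
```

This bonus is simply added to the (possibly sparse) extrinsic reward, so states the agent cannot yet predict well become temporarily attractive.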


Validation Checklist in Kaggle Competition

Posted on 2018-10-16

Data Splitting Strategies

  • Random
  • Timewise
  • By id (maybe hidden)
  • Combined

Notices

  • Make sure the strategy used for train/validation splitting is the same as the train/test splitting strategy (see the sketch after this list).
  • Models trained under different data splitting strategies can show large performance gaps.
  • The logic of feature generation depends on the data splitting strategy.
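A minimal sketch of what "same splitting strategy" means in practice, contrasting a random split with a time-wise split (the file name and the "date" column are hypothetical):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical training data with a timestamp column named "date".
df = pd.read_csv("train.csv", parse_dates=["date"])

# Random split: only appropriate if the test set is also sampled randomly.
train_rnd, val_rnd = train_test_split(df, test_size=0.2, random_state=42)

# Time-wise split: if the test set covers a later period, validate on the latest period too.
df = df.sort_values("date")
cutoff = int(len(df) * 0.8)
train_time, val_time = df.iloc[:cutoff], df.iloc[cutoff:]
```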

Validation problems

  • Validation stage
    • Causes of different scores and optimal parameters
      • Too little data
      • Too diverse and inconsistent data
    • Solutions
      • Average the scores from different K-Fold splits (see the sketch after this list)
      • Tune the model on one split and evaluate the score on another
  • Submission stage
    • We may observe that
      • The LB score is consistently higher/lower than the validation score
      • The LB score is not correlated with the validation score at all
    • Causes
      • We may already have quite different scores across K-Fold splits
        • Make sure the train/validation split is correct
      • Too little data in the public LB
        • Just trust your validation scores
      • Train and test data come from different distributions
        • Classes that appear in the test set do not appear in the train set
          • Shift your predictions (mean of train minus mean of test), estimated via LB probing
        • The class ratio is not the same
          • Make the validation class ratio match the test class ratio
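A minimal sketch of averaging scores over K-Fold splits (the model and metric here, logistic regression and AUC, are placeholders rather than anything from the post):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import roc_auc_score
from sklearn.linear_model import LogisticRegression

def kfold_score(X, y, n_splits=5, seed=42):
    """Average the validation score over several folds for a more stable estimate.

    X, y are assumed to be numpy arrays of features and binary labels.
    """
    scores = []
    for train_idx, val_idx in KFold(n_splits=n_splits, shuffle=True, random_state=seed).split(X):
        model = LogisticRegression(max_iter=1000)
        model.fit(X[train_idx], y[train_idx])
        preds = model.predict_proba(X[val_idx])[:, 1]
        scores.append(roc_auc_score(y[val_idx], preds))
    return np.mean(scores), np.std(scores)
```

Reporting the mean together with the standard deviation across folds also makes it easier to tell whether an LB gap is within the noise of your validation scheme.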

Expect LB shuffle because of

  • Randomness
  • Little amount of data
  • Different public/private distributions

Coursera-dl plugin issues on Windows 10

Posted on 2018-10-16

I can't download the video ~

state: closed opened by: jenkey2011 on: 2017-05-04

I can't download the video ~

Your environment

  • win10
  • Python version: 3.6
  • coursera-dl version: 0.8
  • PS: I'm from China……

Steps to reproduce

    > coursera-dl -u xxx -p xxxx -b html-css-javascript

Then it works, but only the subtitles were downloaded. The cmd shows "The following URLs (64) could not be downloaded:", and they're all video links.

Comments


from: wanghoppe on: 2017-05-17

Me too have the issue…. Could someone help??

from: lvhuiyang on: 2017-05-18

I met the same problem yesterday.

I used the command `coursera-dl -u xx@xxx.com -p xxx course_name --wget` to solve it.

(env: Mac OS, wget, python3.6)

from: balta2ar on: 2017-05-18

Are you guys all from China? Did you try downloading over a VPN or Tor? If it's your government's firewall, we can't do anything about it, use proxies, VPNs and Tor.

from: jenkey2011 on: 2017-05-19

I guess so…… What a pity.

from: FBryce on: 2017-06-02

Hi, if you are from China, adding "52.84.246.72 d3c33hcgiwev3.cloudfront.net" to the hosts file and refreshing DNS with "ipconfig /flushdns" may work.
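A small sketch that automates the manual steps from the comment above on Windows (this helper is hypothetical, not part of coursera-dl; editing the hosts file requires an elevated/administrator prompt):

```python
import subprocess
from pathlib import Path

# Standard location of the hosts file on Windows.
HOSTS = Path(r"C:\Windows\System32\drivers\etc\hosts")
ENTRY = "52.84.246.72 d3c33hcgiwev3.cloudfront.net"

def apply_cloudfront_workaround():
    text = HOSTS.read_text()
    if ENTRY not in text:
        # Append the CloudFront mapping suggested in the comment above.
        HOSTS.write_text(text.rstrip("\n") + "\n" + ENTRY + "\n")
    # Flush the DNS resolver cache so the new mapping takes effect.
    subprocess.run(["ipconfig", "/flushdns"], check=True)
```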

from: wanghoppe on: 2017-06-02
