<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>표군의 논문 읽는 블로그</title>
    <link>https://pyonyo.tistory.com/</link>
    <description>I am an astronaut in my universe.</description>
    <language>ko</language>
    <pubDate>Sat, 6 Jun 2026 21:05:16 +0900</pubDate>
    <generator>TISTORY</generator>
    <ttl>100</ttl>
    <managingEditor>표군</managingEditor>
    <image>
      <title>표군의 논문 읽는 블로그</title>
      <url>https://tistory1.daumcdn.net/tistory/5218532/attach/c9609f77d4d04bfa8950745e94f2787d</url>
      <link>https://pyonyo.tistory.com</link>
    </image>
    <item>
      <title>[Process Control] Model-based control for column-based continuous viral inactivation of biopharmaceuticals</title>
      <link>https://pyonyo.tistory.com/3</link>
      <description>&lt;p style=&quot;text-align: justify;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: 'Noto Serif KR';&quot;&gt;&lt;b&gt;Authors: Moo Sun Hong, Amos E. Lu, Richard D. Braatz et al.&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p style=&quot;color: #333333; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: 'Noto Serif KR';&quot;&gt;&lt;b&gt;Link: &lt;a href=&quot;https://doi.org/10.1002/bit.27846&quot;&gt;https://doi.org/10.1002/bit.27846&lt;/a&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;This post reviews a paper on control design for continuous viral inactivation, one of the steps in a bioprocess.&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;대지 4@600x-100.jpg&quot; data-origin-width=&quot;1891&quot; data-origin-height=&quot;1891&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/btWgMP/btsLdm40dAO/Q4nBqGgE1K2TpbIpgof7KK/img.jpg&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/btWgMP/btsLdm40dAO/Q4nBqGgE1K2TpbIpgof7KK/img.jpg&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/btWgMP/btsLdm40dAO/Q4nBqGgE1K2TpbIpgof7KK/img.jpg&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbtWgMP%2FbtsLdm40dAO%2FQ4nBqGgE1K2TpbIpgof7KK%2Fimg.jpg&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;433&quot; height=&quot;433&quot; data-filename=&quot;대지 4@600x-100.jpg&quot; data-origin-width=&quot;1891&quot; data-origin-height=&quot;1891&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;font-family: GungSeo, serif;&quot;&gt;&lt;b&gt;1. Introduction&lt;/b&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;Conventional bioprocesses have mostly been run batchwise, but recent efforts have sought to &lt;b&gt;convert batch processes into continuous processes&lt;/b&gt; to cut costs and to improve flexibility and product quality.&lt;/li&gt;
&lt;li&gt;These efforts have been active for bioreactors and chromatography, but the &lt;b&gt;viral removal process&lt;/b&gt; has received comparatively little attention. A viral removal process is the step that &lt;b&gt;removes and inactivates viruses or virus-like particles&lt;/b&gt; that may be present in the master cell bank (the pooled cells obtained from a selected cell clone); a representative example is the batch low-$\text{pH}$ hold.&lt;/li&gt;
&lt;li&gt;According to &quot;Downstream processing of monoclonal antibodies&amp;mdash;Application of platform approaches&quot; (Shukla et al., 2007), monoclonal antibodies (mAbs) are stable at low $\text{pH}$, so a low-$\text{pH}$ viral inactivation step can inactivate retroviruses in a wide range of bioprocess products. A base is then added to neutralize the solution. To hold the solution $\text{pH}$ at this stage, &lt;b&gt;the $\text{pK}_a$ of the buffering species must be taken into account&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;The critical process parameters (CPPs) of the low-$\text{pH}$ hold are &lt;b&gt;pH and the residence time distribution (RTD)&lt;/b&gt;. Various approaches to making the low-$\text{pH}$ hold continuous have been proposed, but control of a concrete process system has not been investigated.&lt;/li&gt;
&lt;li&gt;The column-based continuous viral inactivation system proposed in this study uses &lt;b&gt;model-based feedback control of $\text{pH}$ to achieve fast startup and disturbance rejection&lt;/b&gt;. In addition, a UV-transparent inverse tracer is injected periodically to estimate the RTD, from which the minimum residence time (MRT) is estimated and used to adjust the feed flow rate.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;font-family: GungSeo, serif;&quot;&gt;&lt;b&gt;2. Materials and Methods&lt;/b&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;The lab-scale continuous viral inactivation system used in this study is laid out as shown below.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2406&quot; data-origin-height=&quot;790&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cGJAYm/btsLeJLlN5U/b7QLO2UsBnQZM9kqhQAxak/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cGJAYm/btsLeJLlN5U/b7QLO2UsBnQZM9kqhQAxak/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cGJAYm/btsLeJLlN5U/b7QLO2UsBnQZM9kqhQAxak/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcGJAYm%2FbtsLeJLlN5U%2Fb7QLO2UsBnQZM9kqhQAxak%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;2406&quot; height=&quot;790&quot; data-origin-width=&quot;2406&quot; data-origin-height=&quot;790&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;A multi-channel peristaltic pump (a tubing pump for precise flow control) was used.&lt;/li&gt;
&lt;li&gt;An in-line mixer blends the solution from the input tank with acid to lower the $\text{pH}$.&lt;/li&gt;
&lt;li&gt;The acidified solution passes an in-line $\text{pH}$ electrode, which sends the $\text{pH}$ reading to the computer.&lt;/li&gt;
&lt;li&gt;After the $\text{pH}$ electrode, the solution is mixed with a UV-transparent tracer (DIW) injected from the injection tank and then flows into a column packed with inert glass. UV absorbance sensors installed before and after the column quantitatively measure the solution's column transit.&lt;/li&gt;
&lt;li&gt;The solution leaving the column is mixed with base by an in-line mixer to raise the $\text{pH}$ again, and the resulting $\text{pH}$ is measured by a $\text{pH}$ electrode.&lt;/li&gt;
&lt;li&gt;The PLC (programmable logic controller) that controls the system applies a low-pass filter&lt;span style=&quot;color: #333333; text-align: left;&quot;&gt; to smooth the&lt;span&gt; $\text{pH}$&lt;/span&gt; data&lt;/span&gt;.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Bayesian estimation&lt;/b&gt; was used for $\text{pH}$ control; details follow in Section 3.&lt;/li&gt;
&lt;li&gt;The &lt;b&gt;Phi6 bacteriophage&lt;/b&gt; was used as a surrogate for enveloped viruses. Phi6 is non-pathogenic, resembles mammalian enveloped viruses in its properties, and is easy and cheap to culture and assay. &lt;b&gt;P. syringae&lt;/b&gt; was used as the host bacterium for Phi6.&lt;/li&gt;
&lt;li&gt;P. syringae was cultured to the &lt;b&gt;exponential growth phase&lt;/b&gt; (&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;the state in which the host cells are fully active&lt;/span&gt;) and then infected with Phi6 bacteriophage. After 24 hours, &lt;b&gt;bovine pancreatic DNase I&lt;/b&gt; (&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;which degrades host DNA to remove impurities&lt;/span&gt;) was added and the lysate was incubated at room temperature. &lt;b&gt;NaCl&lt;/b&gt; (&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;which precipitates insoluble material&lt;/span&gt;) was then added, the mixture was incubated on ice for 1 hour, and the solution was &lt;b&gt;centrifuged&lt;/b&gt; (&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;removing the host cell debris and impurities precipitated by NaCl&lt;/span&gt;). &lt;b&gt;Polyethylene glycol&lt;/b&gt; was added and the solution was &lt;b&gt;centrifuged&lt;/b&gt; again to precipitate the phage from the lysate. The precipitated phage was &lt;b&gt;resuspended&lt;/b&gt; in a mixture of &lt;b&gt;SM buffer&lt;/b&gt; (&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;for stable storage of the bacteriophage&lt;/span&gt;) and &lt;b&gt;CsCl&lt;/b&gt; (to form a density gradient), and the solution was then &lt;b&gt;ultracentrifuged&lt;/b&gt; to collect the &lt;b&gt;band fraction&lt;/b&gt;. The resulting phage sample was &lt;b&gt;dialyzed&lt;/b&gt; for 3 days (&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;removing residual small-molecule impurities such as PEG and salts like CsCl&lt;/span&gt;) and passed through a &lt;b&gt;membrane filter&lt;/b&gt; (&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;removing remaining cell debris and other large particles&lt;/span&gt;).&lt;/li&gt;
&lt;li&gt;The Phi6 titer (concentration) was analyzed by &lt;b&gt;phage plaque assay&lt;/b&gt;. Samples were serially diluted 1:10 in buffer, and the dilutions were inoculated onto solidified LB medium. A mixture of LB medium and P. syringae cultured for one day was spread evenly on top, dried at room temperature, and incubated for one day. The plaques formed where phage lysed the bacteria were then counted to calculate the phage titer and the &lt;b&gt;LRV&lt;/b&gt; (&lt;span style=&quot;color: #333333; text-align: left;&quot;&gt;logarithmic reduction value;&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;a metric for the degree of removal of viruses, microorganisms, or other contaminants&lt;/span&gt;).&lt;/li&gt;
&lt;/ul&gt;
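&lt;p data-ke-size=&quot;size16&quot;&gt;As a rough illustration of how plaque counts turn into a titer and an LRV, here is a minimal sketch using the usual definitions (titer = plaques / (dilution &amp;times; plated volume), LRV = $\log_{10}$ of the titer ratio before and after treatment). The function names, the plated volume, and the plaque counts are hypothetical and are not taken from the paper.&lt;/p&gt;

```python
import math

def titer_pfu_per_ml(plaques, dilution, plated_volume_ml):
    """Titer in PFU/mL from the plaque count on one plate.

    dilution is the total dilution of the plated sample,
    e.g. 1e-6 after six serial 1:10 steps.
    """
    return plaques / (dilution * plated_volume_ml)

def lrv(titer_in, titer_out):
    """Logarithmic reduction value: log10 of the titer ratio."""
    return math.log10(titer_in / titer_out)

# Hypothetical counts: 50 plaques at 1e-6 dilution before the low-pH hold,
# 50 plaques at 1e-2 dilution after it, 0.1 mL plated each time.
before = titer_pfu_per_ml(plaques=50, dilution=1e-6, plated_volume_ml=0.1)
after = titer_pfu_per_ml(plaques=50, dilution=1e-2, plated_volume_ml=0.1)
print(f"LRV = {lrv(before, after):.2f}")
```

&lt;p data-ke-size=&quot;size16&quot;&gt;With these made-up numbers the titer falls by four orders of magnitude, so the LRV is about 4.&lt;/p&gt;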
&lt;h3 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;font-family: GungSeo, serif;&quot;&gt;&lt;b&gt;3. Results and Discussions&lt;/b&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;color: #333333; text-align: left;&quot;&gt;A model-based reaction-invariant controller was used to handle the nonlinearity of the titration curve. To keep the flow rate through the column constant, the controller manipulates the &lt;b&gt;flow ratio&lt;/b&gt; rather than the flow rate. &lt;b&gt;Bayesian estimation&lt;/b&gt; is also used to &lt;b&gt;update the solution's $\text{pK}_a$ and concentration&lt;/b&gt; from the $\text{pH}$ sensor data. This makes it possible to &lt;b&gt;predict the required acid and base additions in real time&lt;/b&gt;; the detailed control method follows &quot;pH and Conductivity Control in an Integrated Biomanufacturing Plant&quot; (Lu et al., 2016).&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;This study adopted a packed-bed column-based system with a narrow residence time distribution. The column was packed with inert glass beads and designed to &lt;b&gt;meet the required throughput and residence time while keeping pressure drop and dispersion low&lt;/b&gt;. The column flow rate (throughput) and average residence time are given by the expressions below, from which the column diameter $d$ and length $L$ can be determined.&lt;/li&gt;
&lt;/ul&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;$Q = \frac{\pi d^2 v_s}{4}$, &amp;nbsp;$\tau = \frac{L}{u} = \frac{\epsilon L}{v_s}$&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;($v_s$ is the superficial velocity, $u = v_s / \epsilon$ the interstitial velocity, and $\epsilon$ the column porosity)&lt;/p&gt;
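&lt;p data-ke-size=&quot;size16&quot;&gt;As a rough illustration of how the two relations above fix the column geometry, the sketch below solves them for $d$ and $L$. The function name and all numerical inputs (flow rate, superficial velocity, residence time, porosity) are hypothetical values chosen for illustration, not the paper's design values.&lt;/p&gt;

```python
import math

def size_column(Q, v_s, tau, eps):
    """Solve Q = pi d^2 v_s / 4 and tau = eps L / v_s for d and L (SI units)."""
    d = math.sqrt(4.0 * Q / (math.pi * v_s))  # column diameter [m]
    L = tau * v_s / eps                        # column length [m]
    return d, L

# Hypothetical inputs: 10 mL/min feed, 1 mm/s superficial velocity,
# 30 min average residence time, porosity 0.4.
d, L = size_column(Q=10e-6 / 60.0, v_s=1e-3, tau=1800.0, eps=0.4)
print(f"d = {d * 100:.2f} cm, L = {L:.2f} m")
```

&lt;p data-ke-size=&quot;size16&quot;&gt;For a fixed $Q$ and $\tau$, lowering $v_s$ gives a wider, shorter column, which is the lever used in the pressure-drop discussion that follows.&lt;/p&gt;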
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
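&lt;li&gt;The packing size can be obtained from the Kozeny-Carman equation; the pressure drop $\Delta P$ and the peak standard deviation $\sigma$ caused by packing and dispersion are given below, where $\mu$ is the fluid viscosity, $\phi_s$ the particle sphericity, and $d_p$ the packing particle diameter. To &lt;b&gt;satisfy the pressure-drop constraint while minimizing peak broadening&lt;/b&gt;, both the packing size and the superficial velocity were set small.&lt;/li&gt;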
&lt;/ul&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;$\Delta P = \frac{180 \mu \tau (1 - \epsilon)^2}{\phi_s^2 \epsilon^4} \left( \frac{v_s}{d_p} \right)^2$, &amp;nbsp;$\sigma = 2 \sqrt{\tau \frac{d_p}{v_s}}$&lt;/p&gt;
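&lt;p data-ke-size=&quot;size16&quot;&gt;The two expressions above can be evaluated directly, as in the minimal sketch below. It reads $\mu$ as the fluid viscosity, $\phi_s$ as the particle sphericity, and $d_p$ as the particle diameter, as is conventional for the Kozeny-Carman equation; the function names and all numbers are hypothetical.&lt;/p&gt;

```python
import math

def pressure_drop(mu, tau, eps, phi_s, v_s, d_p):
    """Kozeny-Carman pressure drop [Pa], written in terms of tau as above."""
    return 180.0 * mu * tau * (1.0 - eps) ** 2 / (phi_s ** 2 * eps ** 4) * (v_s / d_p) ** 2

def peak_sigma(tau, v_s, d_p):
    """Peak standard deviation [s]: sigma = 2 sqrt(tau d_p / v_s)."""
    return 2.0 * math.sqrt(tau * d_p / v_s)

# Hypothetical case: water-like viscosity, 30 min residence time,
# porosity 0.4, spherical 1 mm beads, 1 mm/s superficial velocity.
base = dict(tau=1800.0, v_s=1e-3, d_p=1e-3)
dP = pressure_drop(mu=1e-3, eps=0.4, phi_s=1.0, **base)
sigma = peak_sigma(**base)

# Shrinking d_p and v_s by the same factor leaves both results unchanged,
# because only the ratio v_s / d_p enters either expression.
small = dict(tau=1800.0, v_s=5e-4, d_p=5e-4)
dP_small = pressure_drop(mu=1e-3, eps=0.4, phi_s=1.0, **small)
sigma_small = peak_sigma(**small)
print(dP, sigma, dP_small, sigma_small)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Both quantities depend on $d_p$ and $v_s$ only through their ratio, but in opposite directions, which is why the two have to be chosen together.&lt;/p&gt;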
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;The acidified, low-$\text{pH}$ solution must be incubated long enough, but &lt;b&gt;incubating it for too long can instead degrade product quality, for example through aggregation&lt;/b&gt;. This tradeoff therefore needs to be understood quantitatively, which can be done &lt;b&gt;through the RTD&lt;/b&gt;. The system can be modeled as a PFR and a CSTR connected in series, and convolving the probability distribution function $E$ of each reactor with the inlet concentration $C_{\text{in}}$ gives &lt;span style=&quot;color: #333333; text-align: left;&quot;&gt;the outlet concentration $C_{\text{out}}$&lt;span&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #333333; text-align: left;&quot;&gt;&lt;span&gt;$C_{\text{out}}(\theta)=C_{\text{in}}(\theta) * E(\theta, \text{Pe}, x)= \int_0^\theta C_{\text{in}}(\theta - \theta ')E(\theta ', \text{Pe}, x) d \theta '$&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #333333; text-align: left;&quot;&gt;&lt;span&gt;$E(\theta, \text{Pe}, x) = E_{\text{PFR}}(\theta, \text{Pe}, x) * E_{\text{CSTR}}(\theta, x)$&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;$F(\theta, \text{Pe}, x)=\int_0^\theta E(\theta ', \text{Pe}, x) d \theta '$&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;($\theta$ is the dimensionless time, $x$ the residence time fraction, and $F$ the cumulative distribution function)&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
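&lt;li&gt;Using the expressions above, the &lt;b&gt;minimum residence time&lt;/b&gt; $\tau_{\text{min}}$ can be defined with respect to $\eta$. Here $\eta$ is the fraction of material that leaves the column before time $\tau_{\text{min}}$; it is typically reported to lie between $10^{-5}$ and $0.005$.&lt;/li&gt;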
&lt;/ul&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;$F(\theta(\tau_{\text{min}}, \tau), \text{Pe}, x)=\eta$&lt;/p&gt;
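&lt;p data-ke-size=&quot;size16&quot;&gt;To make the definition concrete, the sketch below evaluates it for a deliberately simplified case: an ideal plug-flow section with no axial dispersion ($\text{Pe} \to \infty$) followed by a single CSTR. That simplification, the reading of $x$ as the fraction of the mean residence time spent in the plug-flow section, $\theta = t / \tau$, and every number used are assumptions made for illustration; the paper's model keeps a finite $\text{Pe}$.&lt;/p&gt;

```python
import math

def cumulative_rtd(theta, x):
    """F(theta) for ideal plug flow (fraction x) followed by one CSTR (fraction 1 - x)."""
    if theta >= x:
        return 1.0 - math.exp(-(theta - x) / (1.0 - x))
    return 0.0  # nothing leaves before the plug-flow delay has elapsed

def min_residence_time(tau, x, eta):
    """Solve F(theta_min) = eta in closed form; return tau_min = tau * theta_min."""
    theta_min = x - (1.0 - x) * math.log1p(-eta)
    return tau * theta_min

# Hypothetical values: 30 min mean residence time, 90 % plug flow, eta = 1e-4.
tau_min = min_residence_time(tau=1800.0, x=0.9, eta=1e-4)
print(f"tau_min = {tau_min:.2f} s")
```

&lt;p data-ke-size=&quot;size16&quot;&gt;In this idealized case $\tau_{\text{min}}$ sits just above the plug-flow delay $x \tau$; with finite dispersion, material can leave earlier, so the MRT would be shorter.&lt;/p&gt;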
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
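&lt;li&gt;The RTD is affected by reactor design, fouling, the input flow rate, and so on, so &lt;b&gt;disturbances in these variables need to be rejected&lt;/b&gt;. Disturbance rejection matters especially in continuous processes that run for long periods. However, because the RTD is a probability distribution and therefore infinite-dimensional, a scalar variable such as the flow rate cannot reject disturbances in it completely. Rather than rejecting RTD disturbances directly, a more practical alternative is to &lt;b&gt;reject disturbances in scalar measures derived from the RTD, namely the LRV and the MRT&lt;/b&gt;. In this way, disturbances in the residence time can be rejected.&lt;/li&gt;
&lt;li&gt;Controlling the MRT requires adequate measurement of the RTD and MRT, plus a feedback control algorithm that adjusts the flow rate according to the measured MRT. However, &lt;b&gt;for small $\eta$, noise in the in-line sensors makes direct tracer-based measurement infeasible&lt;/b&gt;. Therefore&lt;b&gt; the parameters of the RTD were obtained by Bayesian estimation and the MRT was then determined from them&lt;/b&gt;. By the Beer-Lambert law, the absorbance at 280 nm over time, $A_{280}(t)$, is proportional to the concentration $C(t)$, so the absorbances at the column inlet and outlet are related through convolution with the probability distribution function. Bayesian estimation was then applied to estimate the parameters $\tau$, $\text{Pe}$, and $x$. Substituting the estimated parameters into the cumulative distribution function $F$ gives the MRT.&lt;/li&gt;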
&lt;/ul&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;$A_{280, \text{out}}(\theta) = A_{280, \text{in}}(\theta) * E(\theta, \text{Pe}, x)$&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;$\tau, \text{Pe}, x = \underset{\tau, \text{Pe}, x}{\text{argmin}} \int_0^T [A_{280, \text{in}}(\theta) * E(\theta, \text{Pe}, x) - A_{280, \text{out}}(\theta)]^2 d\theta$&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;To reduce bias and estimate the parameters accurately, &lt;/span&gt;the &lt;b&gt;absorbance must vary sufficiently&lt;/b&gt;. When the system shows no variation, as in a continuous process, &lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;a concentration variation has to be introduced deliberately. &lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;To introduce it safely, &lt;b&gt;deionized water (DIW) was injected periodically&lt;/b&gt;. Because DIW has almost no UV absorbance compared with the protein-containing input solution, it effectively dilutes the solution, lowers the absorbance, and provides the needed variation.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;The &lt;b&gt;MRT computed this way is used for feedback control of the peristaltic pump's rotation speed&lt;/b&gt;. Because the flow rate of a peristaltic pump is nearly proportional to its rotation speed, comparing the setpoint MRT with the current MRT gives the rotation speed $\omega_{i+1}$ for the next time step. When the MRT drops the flow rate must be lowered, and when the MRT rises the flow rate must be raised, which leads to the feedback law below.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;$\omega_{i+1} = \frac{\omega_i \tau_{\text{min}, i}}{\tau_{\text{min, SP}}}$&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
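&lt;p data-ke-size=&quot;size16&quot;&gt;The update rule above is a one-line computation. The sketch below applies it once; the function name and the numbers (pump speed in rpm, MRT in seconds) are hypothetical.&lt;/p&gt;

```python
def next_pump_speed(omega_i, tau_min_i, tau_min_sp):
    """omega_{i+1} = omega_i * tau_min_i / tau_min_SP."""
    return omega_i * tau_min_i / tau_min_sp

# Hypothetical numbers: pump at 20 rpm, estimated MRT 270 s, setpoint 300 s.
# The MRT is too short, so the pump has to slow down.
omega_next = next_pump_speed(omega_i=20.0, tau_min_i=270.0, tau_min_sp=300.0)
print(omega_next)  # 18.0
```

&lt;p data-ke-size=&quot;size16&quot;&gt;The rule relies on the MRT being roughly inversely proportional to the flow rate, and on the flow rate being roughly proportional to the pump speed, so a single step lands close to the setpoint when both hold.&lt;/p&gt;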
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;However, &lt;b&gt;controlling the residence time by manipulating the flow rate alone&lt;/b&gt; uses up the degree of freedom that would otherwise respond to changes in the inlet flow rate, so the scheme is &lt;b&gt;vulnerable to sudden changes in the inlet flow rate&lt;/b&gt;. &lt;b&gt;Diluting the product stream or adding a surge tank&lt;/b&gt; to buffer abrupt changes in the inlet flow rate can mitigate this problem.&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;span style=&quot;caret-color: #000000;&quot;&gt;In the experiments, the $\text{pH}$ controller tracked the setpoint quickly and accurately, and the RTD model predicted the MRT and the absorbance before and after the column well. A sudden change in $\text{pK}_a$ caused by a buffer change was quickly picked up by the Bayesian estimator, and disturbances in the RTD and MRT caused by changing the column length were accurately corrected by the pump speed control.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;span style=&quot;caret-color: #000000;&quot;&gt;Viral inactivation and the LRV were measured while varying the $\text{pH}$ (3.5 to 4.8) and the MRT (2.5 to 20 min) separately. At a fixed MRT of 300 s, the LRV increased as the $\text{pH}$ at the first mixing unit decreased, and at $\text{pH} = 4.5$ the LRV increased with the MRT. These LRV results were reproducible, staying consistent across different run times of the continuous process.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;font-family: GungSeo, serif;&quot;&gt;&lt;b&gt;4. Conclusion&lt;/b&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;After suitable controller design, validation experiments confirmed that tight control of $\text{pH}$ and MRT is achievable. The $\text{pK}_a$ and RTD obtained by Bayesian estimation were also confirmed to be accurate, which enabled fast control of $\text{pH}$ and MRT even under sudden disturbances.&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;span style=&quot;caret-color: #000000;&quot;&gt;Using this control system, the relationship between viral inactivation in the solution produced by the continuous process and the $\text{pH}$ and MRT could be confirmed.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;span style=&quot;caret-color: #000000;&quot;&gt;Such tight control of the critical process parameters helps avoid unnecessary over-adjustment and contributes to better product quality, productivity, and stability in continuous processing.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <category>Paper Reading</category>
      <author>표군</author>
      <guid isPermaLink="true">https://pyonyo.tistory.com/3</guid>
      <comments>https://pyonyo.tistory.com/3#entry3comment</comments>
      <pubDate>Wed, 11 Dec 2024 18:41:52 +0900</pubDate>
    </item>
    <item>
      <title>[Deep Learning] Adam: A Method for Stochastic Optimization</title>
      <link>https://pyonyo.tistory.com/2</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: 'Noto Serif KR';&quot;&gt;&lt;b&gt;Authors: Diederik P. Kingma, Jimmy Lei Ba&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: 'Noto Serif KR';&quot;&gt;&lt;b&gt;Link: &lt;a href=&quot;https://arxiv.org/abs/1412.6980&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://arxiv.org/abs/1412.6980&lt;/a&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;figure id=&quot;og_1732090914517&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;Adam: A Method for Stochastic Optimization&quot; data-og-description=&quot;We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory r&quot; data-og-host=&quot;arxiv.org&quot; data-og-source-url=&quot;https://arxiv.org/abs/1412.6980&quot; data-og-url=&quot;https://arxiv.org/abs/1412.6980v9&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/dZVgV9/hyXzVG7IC0/6wr70gvBSGKJW7cN6g4DU0/img.png?width=1200&amp;amp;height=700&amp;amp;face=0_0_1200_700,https://scrap.kakaocdn.net/dn/bbdfnk/hyXDcHcmQK/3gY0ZnDkJOI0xYhizov8S1/img.png?width=1000&amp;amp;height=1000&amp;amp;face=0_0_1000_1000&quot;&gt;&lt;a href=&quot;https://arxiv.org/abs/1412.6980&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://arxiv.org/abs/1412.6980&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/dZVgV9/hyXzVG7IC0/6wr70gvBSGKJW7cN6g4DU0/img.png?width=1200&amp;amp;height=700&amp;amp;face=0_0_1200_700,https://scrap.kakaocdn.net/dn/bbdfnk/hyXDcHcmQK/3gY0ZnDkJOI0xYhizov8S1/img.png?width=1000&amp;amp;height=1000&amp;amp;face=0_0_1000_1000');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Adam: A Method for Stochastic Optimization&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory r&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;arxiv.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000; text-align: start; font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;I first came across AI while preparing for graduate school, and the thing I kept running into in deep learning code was Adam. &lt;/span&gt;&lt;span style=&quot;color: #000000; text-align: start; font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;I knew the basics of how Adam works, but I wondered why so much deep learning code uses it as the optimizer. My AI course also just said 'you should generally use Adam' and moved on. &lt;/span&gt;&lt;span style=&quot;color: #000000; text-align: start; font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;So I read the paper that first proposed Adam and restructured it in my own words to make it easier to follow.&lt;/span&gt;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2000&quot; data-origin-height=&quot;2000&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cD2TTk/btsKSEFxS3A/IAjIZFF26w5d7GkECOdjJ0/img.jpg&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cD2TTk/btsKSEFxS3A/IAjIZFF26w5d7GkECOdjJ0/img.jpg&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cD2TTk/btsKSEFxS3A/IAjIZFF26w5d7GkECOdjJ0/img.jpg&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcD2TTk%2FbtsKSEFxS3A%2FIAjIZFF26w5d7GkECOdjJ0%2Fimg.jpg&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;387&quot; height=&quot;387&quot; data-origin-width=&quot;2000&quot; data-origin-height=&quot;2000&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;font-family: GungSeo, serif;&quot;&gt;&lt;b&gt;1. Introduction&lt;/b&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;In machine learning the &lt;b&gt;objective function is often stochastic&lt;/b&gt;, owing to random data subsampling, noise, dropout, and similar effects. Efficient techniques for optimizing stochastic objectives over high-dimensional parameter spaces are therefore important.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Optimization methods that use higher-order derivatives (e.g., Newton's method) are very inefficient because the time complexity of the tensor operations grows exponentially with the order. They also perform poorly on non-convex optimization problems. The paper therefore focuses on &lt;b&gt;optimization methods based on first-order derivatives&lt;/b&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1440&quot; data-origin-height=&quot;668&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bsXrZS/btsKRk59U6A/eHO99glH02y9DgTkAMIlYK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bsXrZS/btsKRk59U6A/eHO99glH02y9DgTkAMIlYK/img.png&quot; data-alt=&quot;Gradient descent vs. Newton's method&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bsXrZS/btsKRk59U6A/eHO99glH02y9DgTkAMIlYK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbsXrZS%2FbtsKRk59U6A%2FeHO99glH02y9DgTkAMIlYK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;504&quot; height=&quot;234&quot; data-origin-width=&quot;1440&quot; data-origin-height=&quot;668&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Gradient descent vs. Newton's method&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Adam uses only first-order gradients, together with their first and second moments, to &lt;b&gt;adapt the learning rate of each parameter&lt;/b&gt;. Thanks to its simple algorithmic structure, it &lt;b&gt;needs little memory&lt;/b&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Adam combines the strengths of &lt;b&gt;AdaGrad&lt;/b&gt;, which accumulates the sum of squared gradients and suits sparse gradients, and &lt;b&gt;RMSProp&lt;/b&gt;, which weights recent gradients more heavily and suits on-line (real-time) and non-stationary settings (where statistical properties change over time).&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;font-family: GungSeo, serif;&quot;&gt;&lt;b&gt;2. Algorithm&lt;/b&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1012&quot; data-origin-height=&quot;640&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bW3fiQ/btsKPgxCGKa/WK0peB1OPksIcoRk3YMakK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bW3fiQ/btsKPgxCGKa/WK0peB1OPksIcoRk3YMakK/img.png&quot; data-alt=&quot;Adam Algorithm&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bW3fiQ/btsKPgxCGKa/WK0peB1OPksIcoRk3YMakK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbW3fiQ%2FbtsKPgxCGKa%2FWK0peB1OPksIcoRk3YMakK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;464&quot; height=&quot;293&quot; data-origin-width=&quot;1012&quot; data-origin-height=&quot;640&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Adam Algorithm&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;$g_t$ is the gradient of the objective function at time $t$.&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;$m_t$ is the exponential moving average of $g_t$, and $v_t$ is the exponential moving average of $g_t^2$.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;An &lt;b&gt;exponential moving average&lt;/b&gt; (EMA) is a weighted moving average that covers the entire past history while giving more weight to recent data (the weights decay exponentially going back in time).&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;622&quot; data-origin-height=&quot;534&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/wpE5Q/btsKQv2BVKm/rNLvyu4uefjEz1TpuZfxjk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/wpE5Q/btsKQv2BVKm/rNLvyu4uefjEz1TpuZfxjk/img.png&quot; data-alt=&quot;The recurrence for the exponential moving average in the Adam optimizer, written in closed form.&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/wpE5Q/btsKQv2BVKm/rNLvyu4uefjEz1TpuZfxjk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FwpE5Q%2FbtsKQv2BVKm%2FrNLvyu4uefjEz1TpuZfxjk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;364&quot; height=&quot;313&quot; data-origin-width=&quot;622&quot; data-origin-height=&quot;534&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;The recurrence for the exponential moving average in the Adam optimizer, written in closed form.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
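&lt;p data-ke-size=&quot;size16&quot;&gt;As a small check of the closed form, the sketch below compares the recurrence $m_t = \beta m_{t-1} + (1-\beta) g_t$ (with $m_0 = 0$) against the sum $m_t = (1-\beta)\sum_{i=1}^{t} \beta^{t-i} g_i$. The gradient values and the function names are made up for illustration and are not from the paper.&lt;/p&gt;

```python
def ema_recursive(grads, beta):
    """Run m_t = beta * m_{t-1} + (1 - beta) * g_t starting from m_0 = 0."""
    m = 0.0
    for g in grads:
        m = beta * m + (1.0 - beta) * g
    return m


def ema_closed_form(grads, beta):
    """Evaluate m_t = (1 - beta) * sum_i beta**(t - i) * g_i directly."""
    t = len(grads)
    return (1.0 - beta) * sum(beta ** (t - i) * g for i, g in enumerate(grads, start=1))


grads = [0.5, -1.0, 2.0, 0.25]  # made-up gradient values for illustration
beta = 0.9
m_rec = ema_recursive(grads, beta)
m_closed = ema_closed_form(grads, beta)

# Weight given to each gradient, oldest first: it shrinks geometrically with age.
weights = [(1.0 - beta) * beta ** (len(grads) - i) for i in range(1, len(grads) + 1)]
print(m_rec, m_closed, weights)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;The two values agree up to floating-point rounding, and the weight list makes the exponential decay toward the past explicit.&lt;/p&gt;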
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&lt;span style=&quot;letter-spacing: 0px;&quot;&gt;Adam is an algorithm that chooses its update stepsize carefully.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&lt;span style=&quot;letter-spacing: 0px;&quot;&gt;&lt;b&gt;In the most extreme sparse-gradient case&lt;/b&gt;, the gradient was 0 at every past timestep, so the update at the current timestep would be slow and inefficient. &lt;b&gt;The stepsize therefore needs to be scaled up&lt;/b&gt;, which corresponds to multiplying the learning rate ($\alpha$) by the constant $(1-\beta_1) / \sqrt{1-\beta_2}$, which is greater than 1 in this case.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;letter-spacing: 0px; font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Why does the paper say that the inequality $(1-\beta_1) &amp;gt; \sqrt{1-\beta_2}$ holds in the sparse-gradient case?&lt;br /&gt;With sparse gradients, enlarging the stepsize $|\Delta_t| = \left\vert \alpha \frac{\hat{m}_t}{\sqrt{\hat{v}_t}} \right\vert$ requires making $m_t$ in the numerator large and $v_t$ in the denominator small. To achieve that, $m_t$ should take little of the past data into account, while $v_t$ should take a lot of it into account. The reason is that most past gradients are 0: averaging over all of them shrinks the magnitude of the exponential moving average, whereas averaging over only a few keeps it large. So lowering $\beta_1$ and raising $\beta_2$ is what effectively scales the stepsize up.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;In the remaining, more common cases, $(1-\beta_1) \leq \sqrt{1-\beta_2}$, so $|\Delta_t| \leq \alpha$ holds. &lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;Therefore, &lt;b&gt;when a prior distribution over the model parameters is known, an appropriate scale for the learning rate ($\alpha$) can be estimated&lt;/b&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;The factor $\hat{m}_t / \sqrt{\hat{v}_t}$ that multiplies the learning rate ($\alpha$) can be read as a signal-to-noise ratio (SNR). This SNR approaches 0 as the parameters get close to an optimum, so &lt;b&gt;the algorithm performs a form of automatic stepsize annealing&lt;/b&gt; by itself.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;color: #000000; font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&lt;span style=&quot;caret-color: #000000;&quot;&gt;If the gradient $g_t$ of the objective is multiplied by a constant, &lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;$\hat{m}_t$ and $\sqrt{\hat{v}_t}$ are both scaled by the same factor, so the stepsize $|\Delta_t|$ is unaffected. In other words, &lt;b&gt;the stepsize is invariant to the scale of the gradient.&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
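&lt;p data-ke-size=&quot;size16&quot;&gt;The stepsize properties above can be checked numerically. The following is a minimal sketch, assuming $\epsilon = 0$ and a scalar parameter; the helper name &lt;code&gt;adam_last_step&lt;/code&gt; and the gradient sequences are hypothetical choices for illustration, not taken from the paper.&lt;/p&gt;

```python
import math


def adam_last_step(grads, alpha=0.001, beta1=0.9, beta2=0.999):
    """Return the final update Delta_t = alpha * m_hat / sqrt(v_hat), with epsilon = 0."""
    m, v = 0.0, 0.0
    for g in grads:
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
    t = len(grads)
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    return alpha * m_hat / math.sqrt(v_hat)


alpha, beta1, beta2 = 0.001, 0.9, 0.999

# Extreme sparsity: the gradient is zero everywhere except at the current step.
sparse_grads = [0.0] * 19999 + [3.0]
sparse_step = adam_last_step(sparse_grads, alpha, beta1, beta2)
sparse_bound = alpha * (1.0 - beta1) / math.sqrt(1.0 - beta2)

# Dense case: a constant gradient gives a step of magnitude alpha.
dense_step = adam_last_step([3.0] * 100, alpha, beta1, beta2)

# Scale invariance: multiplying every gradient by 1000 leaves the step unchanged.
base = adam_last_step([0.5, -1.0, 2.0, 0.25], alpha, beta1, beta2)
scaled = adam_last_step([500.0, -1000.0, 2000.0, 250.0], alpha, beta1, beta2)

print(sparse_step, sparse_bound, dense_step, base, scaled)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;With these settings the sparse step matches $\alpha (1-\beta_1)/\sqrt{1-\beta_2}$ (roughly $3.16\alpha$), the dense step matches $\alpha$, and the scaled run reproduces the unscaled step up to rounding.&lt;/p&gt;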
&lt;h3 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;font-family: GungSeo, serif;&quot;&gt;&lt;b&gt;3. Initialization Bias Correction&lt;/b&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Because $m_0 = v_0 = 0$, &lt;b&gt;the moment estimates are biased toward 0&lt;/b&gt;, especially during the first timesteps. This can hinder training, so the moment estimates are corrected to make learning more efficient.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Mini-batch training cannot compute the true expectation of $g_t^2$ exactly, so it has to be estimated from the expectation of $v_t$ (memory efficiency is probably another reason). The relation shown in the figure below then holds. $\zeta$ is the error term introduced by approximating each $\mathbb{E}[g_i^2]$ with $\mathbb{E}[g_t^2]$; it is 0 when the true second moment is stationary.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;952&quot; data-origin-height=&quot;364&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/c8TWN8/btsKVaJdWA6/7cUVsLgp3YNk8P8YPdJAe0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/c8TWN8/btsKVaJdWA6/7cUVsLgp3YNk8P8YPdJAe0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/c8TWN8/btsKVaJdWA6/7cUVsLgp3YNk8P8YPdJAe0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fc8TWN8%2FbtsKVaJdWA6%2F7cUVsLgp3YNk8P8YPdJAe0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;421&quot; height=&quot;161&quot; data-origin-width=&quot;952&quot; data-origin-height=&quot;364&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;$\zeta$ can be kept close to 0 by choosing $\beta$ appropriately, so that gradients far in the past receive only small weights.&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;In that case &lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;$\mathbb{E}[g_t^2] \approx \frac{\mathbb{E}[v_t]}{(1-\beta_2^t)}$, which is what justifies introducing the &lt;b&gt;bias-correction term&lt;/b&gt; shown above.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
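&lt;p data-ke-size=&quot;size16&quot;&gt;To make the bias-correction relation concrete, here is a small sketch for the simplest stationary case: a constant gradient, for which $\zeta = 0$ and $v_t = (1-\beta_2^t)\, g^2$ exactly. The function name and the constant gradient are illustrative assumptions, not part of the paper.&lt;/p&gt;

```python
def second_moment_estimates(grads, beta2=0.999):
    """Return two lists: the raw v_t and the corrected v_t / (1 - beta2**t)."""
    v = 0.0
    raw, corrected = [], []
    for t, g in enumerate(grads, start=1):
        v = beta2 * v + (1.0 - beta2) * g * g
        raw.append(v)
        corrected.append(v / (1.0 - beta2 ** t))
    return raw, corrected


g = 2.0  # constant gradient, so the true second moment is g**2 = 4 at every step
raw, corrected = second_moment_estimates([g] * 10)
print(raw[0], corrected[0], corrected[-1])
```

&lt;p data-ke-size=&quot;size16&quot;&gt;The raw estimate starts near 0.004, far below the true value 4, while the corrected estimate stays at 4 from the first step on.&lt;/p&gt;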
&lt;h3 style=&quot;color: #000000;&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;font-family: GungSeo, serif;&quot;&gt;&lt;b&gt;4. Convergence Analysis&lt;/b&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&lt;b&gt;Regret&lt;/b&gt;&lt;span style=&quot;color: #000000; text-align: justify;&quot;&gt; is a concept rooted in decision theory and regret theory. It is defined as the sum, over every decision step, of the loss incurred by the actual choice minus the loss of the best fixed choice in hindsight. It is frequently used as a key performance measure for stochastic learning models such as reinforcement learning.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;310&quot; data-origin-height=&quot;78&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bvnoyz/btsKUA2t0Sm/ETUcdDmkB8kkHyXX3qKhL1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bvnoyz/btsKUA2t0Sm/ETUcdDmkB8kkHyXX3qKhL1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bvnoyz/btsKUA2t0Sm/ETUcdDmkB8kkHyXX3qKhL1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbvnoyz%2FbtsKUA2t0Sm%2FETUcdDmkB8kkHyXX3qKhL1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;231&quot; height=&quot;58&quot; data-origin-width=&quot;310&quot; data-origin-height=&quot;78&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;For the Adam algorithm, &lt;b&gt;the regret is bounded above&lt;/b&gt;, and the regret bound is of order $O(\sqrt{T})$. Consequently &lt;b&gt;the average regret $\frac{R(T)}{T}$ converges to 0&lt;/b&gt; over time. This means that, in the long run, the algorithm's decisions approach the optimal ones on average.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;Late in training the parameters approach an optimum and the variability of the gradient decreases, so for efficient learning it is better to rely less on past gradient information. &lt;span style=&quot;color: #333333; text-align: left;&quot;&gt;&lt;b&gt;It therefore helps to decrease $\beta$ late in training&lt;/b&gt;. The paper proposes decaying $\beta_1$ exponentially over time ($\beta_{1,t} = \beta_1 \lambda^{t-1}$).&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&lt;span style=&quot;letter-spacing: 0px;&quot;&gt;When the data is sparse, the parameters whose gradients are 0 make the upper bound on the regret even smaller. &lt;b&gt;Adam is therefore also well suited to sparse problems&lt;/b&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
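&lt;p data-ke-size=&quot;size16&quot;&gt;The notion of average regret can be illustrated on a toy online problem. The sketch below runs a scalar Adam with the annealed stepsize $\alpha_t = \alpha/\sqrt{t}$ and the decayed rate $\beta_{1,t} = \beta_1 \lambda^{t-1}$ on the convex losses $f_t(\theta) = \frac{1}{2}(\theta - c_t)^2$. The loss family, the alternating targets $c_t$, the starting point, and the running-product form of the first-moment bias correction are all assumptions made for this example, not details from the paper.&lt;/p&gt;

```python
import math


def adam_average_regret(targets, theta0=5.0, alpha=0.1, beta1=0.9, beta2=0.999,
                        lam=0.999, eps=1e-8):
    """Average regret R(T)/T of Adam on the losses f_t(theta) = 0.5 * (theta - c_t)**2."""
    best = sum(targets) / len(targets)  # best fixed parameter in hindsight for these losses
    theta, m, v, regret = theta0, 0.0, 0.0, 0.0
    beta1_prod = 1.0  # running product of beta_{1,i}, used to debias m (assumed form)
    for t, c in enumerate(targets, start=1):
        # Regret accumulates the gap between the online loss and the hindsight-best loss.
        regret += 0.5 * (theta - c) ** 2 - 0.5 * (best - c) ** 2
        g = theta - c  # gradient of f_t at the current parameter
        beta1_t = beta1 * lam ** (t - 1)  # decayed first-moment rate
        beta1_prod *= beta1_t
        m = beta1_t * m + (1.0 - beta1_t) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1_prod)
        v_hat = v / (1.0 - beta2 ** t)
        theta -= (alpha / math.sqrt(t)) * m_hat / (math.sqrt(v_hat) + eps)
    return regret / len(targets)


# Targets alternate between +1 and -1, so the best fixed parameter is 0.
targets = [1.0 if t % 2 == 0 else -1.0 for t in range(10000)]
avg_regret_100 = adam_average_regret(targets[:100])
avg_regret_10000 = adam_average_regret(targets)
print(avg_regret_100, avg_regret_10000)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;On this toy problem the average regret over 10000 steps comes out well below the average over the first 100 steps, which is the qualitative behavior that an $O(\sqrt{T})$ regret bound implies.&lt;/p&gt;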
&lt;h3 style=&quot;color: #000000; text-align: justify;&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;font-family: GungSeo, serif;&quot;&gt;&lt;b&gt;5. Related Work&lt;/b&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Adam can be thought of as combining the advantages of RMSProp and Momentum, but &lt;b&gt;it is not the same as simply merging the two.&lt;/b&gt; RMSProp + Momentum updates the parameters using a momentum ($V_{t,j}$) of the rescaled gradient ($G_{t,j}$), whereas Adam updates the parameters directly from bias-corrected moving averages of the first and second moments of the gradient.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;AdaGrad is a special case of Adam: it is equivalent to the Adam algorithm with $\beta_1=0$ and $\beta_2 \to 1$, combined with the learning rate schedule $\alpha_t = \frac{\alpha}{\sqrt{t}}$.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;904&quot; data-origin-height=&quot;554&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/ciMVxT/btsKTpOwYhz/AwMGfjhhFI0i0mfqkPjwFK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/ciMVxT/btsKTpOwYhz/AwMGfjhhFI0i0mfqkPjwFK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/ciMVxT/btsKTpOwYhz/AwMGfjhhFI0i0mfqkPjwFK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FciMVxT%2FbtsKTpOwYhz%2FAwMGfjhhFI0i0mfqkPjwFK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;490&quot; height=&quot;300&quot; data-origin-width=&quot;904&quot; data-origin-height=&quot;554&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
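&lt;p data-ke-size=&quot;size16&quot;&gt;The AdaGrad correspondence can be verified numerically. With $\beta_1 = 0$ we have $\hat{m}_t = g_t$, and as $\beta_2 \to 1$ the bias-corrected $\hat{v}_t$ tends to the plain average $\frac{1}{t}\sum_{i=1}^{t} g_i^2$, so $\frac{\alpha}{\sqrt{t}} \cdot \frac{g_t}{\sqrt{\hat{v}_t}}$ becomes $\alpha\, g_t / \sqrt{\sum_i g_i^2}$. The sketch below assumes $\epsilon = 0$ and uses $\beta_2 = 1 - 10^{-8}$ as a stand-in for the limit; the function names and gradient values are made up for illustration.&lt;/p&gt;

```python
import math


def adagrad_steps(grads, alpha):
    """AdaGrad update alpha * g_t / sqrt(sum of squared gradients so far)."""
    total, steps = 0.0, []
    for g in grads:
        total += g * g
        steps.append(alpha * g / math.sqrt(total))
    return steps


def adam_limit_steps(grads, alpha, beta2=1.0 - 1e-8):
    """Adam with beta1 = 0, beta2 close to 1, and the annealed rate alpha / sqrt(t)."""
    v, steps = 0.0, []
    for t, g in enumerate(grads, start=1):
        v = beta2 * v + (1.0 - beta2) * g * g
        v_hat = v / (1.0 - beta2 ** t)  # bias correction turns v into a near-uniform average
        steps.append((alpha / math.sqrt(t)) * g / math.sqrt(v_hat))  # beta1 = 0, so m_hat = g
    return steps


grads = [0.5, -1.0, 2.0, 0.25, -0.75]  # made-up gradient values for illustration
ada = adagrad_steps(grads, alpha=0.01)
adam = adam_limit_steps(grads, alpha=0.01)
print(ada)
print(adam)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;The two step sequences agree closely. Note that the match relies on the bias-correction term: without it, $v_t$ would be tiny for $\beta_2$ this close to 1 and the steps would blow up.&lt;/p&gt;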
&lt;h3 style=&quot;color: #000000; text-align: justify;&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;font-family: GungSeo, serif;&quot;&gt;&lt;b&gt;6. Experiments&lt;/b&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;MNIST dataset: 1/&amp;radic;t decay를&amp;nbsp;적용한&amp;nbsp;AdaGrad, SGD와&amp;nbsp;Adam의&amp;nbsp;로지스틱&amp;nbsp;회귀&amp;nbsp;모델&amp;nbsp;성능&amp;nbsp;비교.&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&amp;rarr;&amp;nbsp;Adam은&amp;nbsp;AdaGrad보다&amp;nbsp;빠르게, SGD와&amp;nbsp;유사하게&amp;nbsp;수렴함.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;IMDB BoW dataset: Dropout과&amp;nbsp;전처리를&amp;nbsp;적용한&amp;nbsp;sparse dataset에&amp;nbsp;대하여&amp;nbsp;AdaGrad, RMSProp, SGD, Adam의&amp;nbsp;성능을&amp;nbsp;비교.&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&amp;rarr;&amp;nbsp;Adam은&amp;nbsp;sparse features에&amp;nbsp;대해&amp;nbsp;좋은&amp;nbsp;성능을&amp;nbsp;보였고,&amp;nbsp;특히&amp;nbsp;SGD보다&amp;nbsp;크게&amp;nbsp;개선된&amp;nbsp;수렴&amp;nbsp;속도를&amp;nbsp;보였음.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Multi-layer NN (+ Dropout)&amp;nbsp;모델&amp;nbsp;비교&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&amp;rarr;&amp;nbsp;&lt;b&gt;Non-convex&amp;nbsp;최적화&amp;nbsp;문제임에도&amp;nbsp;불구&lt;/b&gt;하고&amp;nbsp;Adam이&amp;nbsp;가장&amp;nbsp;빠르게&amp;nbsp;수렴하였음.&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;(SFO method: 전체 목적 함수를 미니배치 단위의 부분 함수들의 합으로 분해하여 stochastic과 quasi-Newton method의 장점을 결합)&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;CNN (+ Dropout)&amp;nbsp;모델&amp;nbsp;비교&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&amp;rarr;&amp;nbsp;Dropout을&amp;nbsp;적용한&amp;nbsp;모델과&amp;nbsp;적용하지&amp;nbsp;않은&amp;nbsp;모델끼리&amp;nbsp;묶어서&amp;nbsp;비교했을&amp;nbsp;때&amp;nbsp;둘&amp;nbsp;다&amp;nbsp;Adam이&amp;nbsp;가장&amp;nbsp;빨리&amp;nbsp;수렴.&lt;/span&gt;&lt;br /&gt;&lt;b&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;(다만 실제로는 SGDMomentum가 Adam보다 좋은 성능을 보일 때도 많음.)&lt;/span&gt;&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Bias-correction term analysis (RMSProp + Momentum vs. Adam)&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&amp;rarr; Without the bias-correction term, the loss becomes more unstable as $\beta_2$ approaches 1. The best results were obtained with the bias-correction term and $\beta_2$ close to 1. Regardless of the hyperparameter setting (including $\beta_1 = 0$), Adam matched or outperformed the variant without bias correction (RMSProp + Momentum) in both robustness and optimization performance.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
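&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;One way to see why the bias-correction result in the last bullet is plausible is to look at the very first update. The sketch below is only an illustration and is not code from the paper: the function name first_step, the scalar gradient g = 1.0, and the values of alpha and eps are assumptions chosen for the example. It compares the size of the first step with and without bias correction for several values of $\beta_2$.&lt;/span&gt;&lt;/p&gt;
&lt;pre class=&quot;python&quot;&gt;&lt;code&gt;
```python
import math

# Illustrative sketch, not code from the paper: size of the very first update
# with and without bias correction, for a single scalar gradient g.
def first_step(g, beta1, beta2, alpha=0.001, eps=1e-8):
    m = (1.0 - beta1) * g        # first moment after one step (m_0 = 0)
    v = (1.0 - beta2) * g * g    # second moment after one step (v_0 = 0)
    no_correction = alpha * m / (math.sqrt(v) + eps)
    m_hat = m / (1.0 - beta1)    # bias-corrected moments at t = 1
    v_hat = v / (1.0 - beta2)
    with_correction = alpha * m_hat / (math.sqrt(v_hat) + eps)
    return no_correction, with_correction

ratios = {}
for beta2 in (0.99, 0.999, 0.9999):
    plain, corrected = first_step(g=1.0, beta1=0.9, beta2=beta2)
    ratios[beta2] = plain / corrected
    print(f"beta2={beta2}: uncorrected first step is {ratios[beta2]:.2f}x the corrected one")
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;In this toy setting the uncorrected first step grows as $\beta_2$ moves toward 1, because the zero-initialized second moment is underestimated by a factor of $(1 - \beta_2)$. This covers only the first step and is not a reproduction of the paper's experiment.&lt;/span&gt;&lt;/p&gt;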
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;font-family: GungSeo, serif;&quot;&gt;&lt;b&gt;7. AdaMax&lt;/b&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Generalizes the $L^2$ norm in Adam's denominator to an $L^p$ norm. Letting $p \to \infty$ yields a simple and stable algorithm.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
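&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;As a brief sketch of that limit, following the derivation in the Adam paper: here $v_t$ denotes the $L^p$ analogue of the second moment (with $v_0 = 0$) and $u_t$ the limiting quantity that ends up in the denominator.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$$v_t = \beta_2^p \, v_{t-1} + (1 - \beta_2^p) \, |g_t|^p = (1 - \beta_2^p) \sum_{i=1}^{t} \beta_2^{p(t-i)} \, |g_i|^p$$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$$u_t = \lim_{p \to \infty} (v_t)^{1/p} = \max\left(\beta_2^{t-1} |g_1|, \; \beta_2^{t-2} |g_2|, \; \ldots, \; |g_t|\right) = \max\left(\beta_2 \, u_{t-1}, \; |g_t|\right)$$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;The factor $(1 - \beta_2^p)^{1/p}$ tends to 1 as $p \to \infty$, and the $p$-th root of the sum is dominated by its largest term, which is why only a max remains.&lt;/span&gt;&lt;/p&gt;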
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;636&quot; data-origin-height=&quot;344&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/mB7RT/btsKSIVybff/HJqcQP03x11p4NKcNBeHtK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/mB7RT/btsKSIVybff/HJqcQP03x11p4NKcNBeHtK/img.png&quot; data-alt=&quot;The parts highlighted in red differ from the original Adam algorithm.&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/mB7RT/btsKSIVybff/HJqcQP03x11p4NKcNBeHtK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FmB7RT%2FbtsKSIVybff%2FHJqcQP03x11p4NKcNBeHtK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;545&quot; height=&quot;295&quot; data-origin-width=&quot;636&quot; data-origin-height=&quot;344&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;The parts highlighted in red differ from the original Adam algorithm.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Compared with Adam, the bias correction for the second moment is dropped (the first-moment correction remains, folded into the step size as $\alpha / (1 - \beta_1^t)$), and the second moment is redefined as a recursive expression of the $L^p$ norm. The second-moment correction is unnecessary because, in the $p \to \infty$ limit, the recursion becomes a max operation, so the influence of the zero initialization (and the bias it causes) disappears after the first update.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&lt;b&gt;No bias correction is needed for the second moment&lt;/b&gt;, and the bound on the update step size becomes simpler (the paper gives $|\Delta_t| \le \alpha$). The max operator can also smooth out sensitivity to sparse or high-variance gradients. However, because the denominator $u_t$ only decays by a factor of $\beta_2$ per step, a single large gradient keeps it large for many subsequent steps, so the effective step size can stay small for a long time.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
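&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;The AdaMax update described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name adamax_minimize, the toy objective $f(x) = (x - 3)^2$, the step count, and the guard for a zero denominator are all choices made for this example. The default hyperparameters are the values suggested in the paper.&lt;/span&gt;&lt;/p&gt;
&lt;pre class=&quot;python&quot;&gt;&lt;code&gt;
```python
# Minimal AdaMax sketch on a toy 1-D quadratic f(x) = (x - 3)^2.
# The objective, function name, and zero-denominator guard are assumptions
# made for illustration; they are not taken from the paper.

def adamax_minimize(grad_fn, theta, alpha=0.002, beta1=0.9, beta2=0.999, steps=5000):
    m, u = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1.0 - beta1) * g   # first moment, same as in Adam
        u = max(beta2 * u, abs(g))          # infinity-norm second moment
        if u == 0.0:                        # all gradients so far were zero
            continue                        # skip the update to avoid 0/0
        # Only the first moment is bias-corrected, via the step size.
        theta -= (alpha / (1.0 - beta1 ** t)) * m / u
    return theta

theta = adamax_minimize(lambda x: 2.0 * (x - 3.0), theta=0.0)
print(theta)  # ends up close to the minimizer x = 3
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Note that u needs no correction term: after the first nonzero gradient it equals abs(g), so the zero initialization leaves no trace.&lt;/span&gt;&lt;/p&gt;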
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;font-family: GungSeo, serif;&quot;&gt;&lt;b&gt;8. Conclusion&lt;/b&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;&amp;bull;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;b&gt;Adam combines the strengths of RMSProp and AdaGrad and also works well on non-convex optimization problems, which is what makes it so appealing!&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;</description>
      <category>논문 리딩</category>
      <author>표군</author>
      <guid isPermaLink="true">https://pyonyo.tistory.com/2</guid>
      <comments>https://pyonyo.tistory.com/2#entry2comment</comments>
      <pubDate>Fri, 22 Nov 2024 21:46:35 +0900</pubDate>
    </item>
  </channel>
</rss>