Jekyll2019-05-10T10:10:45+02:00https://gdurisi.github.io/feed.xmlGiuseppe DurisiPersonal websiteGiuseppe Durisidurisi@chalmers.seSurvey on short codes2019-03-14T00:00:00+01:002019-03-14T00:00:00+01:00https://gdurisi.github.io/news-post/2019/03/14/news-post-phy-com<p>What are the best performing coding schemes for the transmission of short packets?</p>
<p>Check out our <a href="https://arxiv.org/abs/1812.08562"><kbd>recent survey</kbd></a>, where we benchmark the performance of algebraic, polar, LDPC codes and more against modern bounds from nonasymptotic information theory.</p>
<p class="notice--success">M. C. Coskun, G. Durisi, T. Jerkovits, G. Liva, W. Ryan, B. Stein, and F. Steiner, “Efficient error-correcting codes in the short blocklength regime,” Physical Communication, Mar. 2019. [<a href="https://arxiv.org/abs/1812.08562"><kbd>arXiv</kbd></a>]</p>Giuseppe Durisidurisi@chalmers.seA survey of good coding schemes for the transmission of short packetsMy view on massive machine-type communications2018-12-13T00:00:00+01:002018-12-13T00:00:00+01:00https://gdurisi.github.io/blog/2018/12/13/interview-mMTC<p><img src="/images/connected.png" alt="image-left" class="align-center" /></p>
<p>I recently presented my view on massive machine-type communications (mMTC) in an interview for the <a href="http://cn.committees.comsoc.org/files/2018/12/NewsLetter_Dec_2018.pdf">Cognitive Networks Technical Committee Newsletter</a>.
An excerpt of this interview is reported below.
Do not hesitate to <a href="mailto:durisi@chalmers.se">get in touch</a> if you have questions/comments. Also, I have some <a href="/news-post/2018/11/19/vacancies/">openings</a> in this area.</p>
<hr />
<p><em>Question:</em></p>
<p>How do you define future massive machine-type communication (mMTC) networks, and what will be the business cases for such 5G networks?</p>
<p><em>Answer:</em></p>
<p>Machine-type communications (MTCs), i.e., machine-centric rather than human-centric communications, will form the backbone of the upcoming automated society. In 5G, they will come in two flavors:</p>
<ul>
<li>
<p>ultra-reliable, low-latency communications (URLLC), which will target wireless connections with stringent requirements on both latency and reliability;</p>
</li>
<li>
<p>massive MTC (mMTC), which will focus instead on providing connectivity to a large number of devices that sporadically transmit small amounts of traffic.</p>
</li>
</ul>
<p>I see mMTC as the key technology needed to scale up the internet of things (IoT), from its current limited use in the consumer sector, to its general use by enterprises and public entities, including municipalities.</p>
<p>The country I live in, i.e., Sweden, has the objective to become best in the world in using the opportunities brought by digitalization. I believe that deploying mMTC solutions systematically at an enterprise and municipality level will be key to achieving this goal.
Once available, such IoT connectivity solutions will bring many advantages, which span from increasing revenues by enabling new business models, to improving efficiency and decreasing costs. Pilot projects are already ongoing in different parts of the world where IoT technology is used for smart lighting, waste disposal, asset tracking, process monitoring and optimization, etc.</p>
<p>Connectivity solutions for mMTC are currently under development both in the licensed part of the spectrum (LTE and 5G) and the unlicensed part (low-power wide-area network (LP-WAN) solutions such as SigFox and LoRaWAN). One critical challenge in developing general mMTC solutions is that MTC devices will be extremely heterogeneous in terms of computational capabilities, cost, energy consumption, and transmission power. It will probably be hard to develop a single cellular technology able to address such heterogeneity, and there will be business opportunities for alternative specialized technologies, especially in the LP-WAN area.</p>
<hr />
<p><em>Question:</em></p>
<p>Could you please explain the most pertinent mMTC techniques/scenarios for 5G?</p>
<p><em>Answer:</em></p>
<p>When I think of mMTC, I have in mind the following scenario: a wireless technology able to provide connectivity to a massive number of low-cost, low-complexity devices, which transmit sporadically small payloads. Supporting such a traffic typology requires a profound redesign of the radio-access network of current cellular systems, which are typically designed to maximize the throughput of only a few active users.</p>
<p>For example, the presence of a massive number of sporadically active users calls for asynchronous random-access protocols. Small payloads cause finite-blocklength effects that need to be accounted for to obtain accurate performance estimates. Furthermore, latency and energy efficiency considerations call for a redesign of the control plane, aimed at minimizing protocol overheads. Finally, novel security and privacy protocols need to be developed, especially in the practically relevant use case in which low-complexity mMTC devices are connected to a powerful, but not necessarily trustworthy cloud server.</p>
<hr />
<p><em>Question:</em></p>
<p>Could you please elaborate on the main role of mMTC in key vertical industries? How does the telecoms sector engage with them during the standardization phase? How does it actually build these state-of-the-art networks? Is 2020 a realistic deadline?</p>
<p><em>Answer:</em></p>
<p>Possible scenarios for mMTCs include structural and environmental monitoring, asset tracking, and process monitoring and optimization. The volume necessary for the success of such technology will come only if mMTCs are used extensively in many of the services offered by municipalities. Examples may include maintenance of road signs and lighting, as well as smart waste disposal.</p>
<p>One challenge is that mMTCs as currently defined in 5G appear too heterogeneous for a single technology to cover all use cases. And this is true even if we focus on a single vertical scenario, say factory automation, where different MTC-enabled applications come with completely different throughput, latency, and reliability requirements. Compare for example the requirements on a wireless link used to control an actuator with the requirements on a link allowing an operator to access sensor data on a mobile device for monitoring purposes.</p>
<p>Telecoms are heavily involved in pilot projects in different vertical industries. For example, Chalmers is part of a successful project in the area of <a href="https://research.chalmers.se/en/project/8452">5G-enabled manufacturing</a> together with Ericsson, SKF, Volvo, and Siemens in which demonstrators for manufacturing system design, deployment, operation and maintenance are under development.</p>
<p>The 2020 deadline for 5G standardization seems very ambitious as far as both mMTC and URLLC are concerned. I am sure that these use cases will be at the center of several subsequent releases after 2020.</p>
<hr />
<p><em>Question:</em></p>
<p>Could you please briefly introduce the most recent research project that you have done in mMTC?</p>
<p><em>Answer:</em></p>
<p>Over the last 6 years I have been working extensively together with many collaborators on the problem of developing communication-theoretic tools for the design of the physical layer of short-packet transmission systems. This <a href="https://github.com/yp-mit/spectre">toolbox</a> is now fairly mature and ready to be used as an ingredient for complex system designs beyond the physical layer.</p>
<p>In this respect, I have recently started two projects whose focus is precisely mMTC design. The first project, which is conducted together with two Swedish companies, <a href="https://www.qamcom.se/">QAMCOM</a> and <a href="https://blink.services/">Blink Services</a>, and is sponsored by the <a href="https://strategiska.se/en/about-ssf/">Swedish Foundation for Strategic Research</a>, aims at investigating novel LP-WAN designs to support mMTC. One specific objective of the project is to optimally design a virtual network connecting LP-WANs owned by different service providers, to enable the deployment of services on a national scale.</p>
<p>The second project, which is supported by the <a href="http://wasp-sweden.org">Swedish Wallenberg AI Autonomous Systems and Software Program</a>, deals with security and privacy in mMTC. Specifically, the project has the goal of designing low-latency, massive multi-client, verifiable delegation-of-computation protocols that provide the highest possible level of security and privacy compatible with the computational resources available at the mMTC devices.</p>
<hr />
<p><em>Question:</em></p>
<p>Beyond your own work, are there any resources that you would like to recommend, especially to those who are new to the field and want to learn more about mMTC? What are three key papers in the area?</p>
<p><em>Answer:</em></p>
<p>There are many excellent works on the design of mMTCs. I would like to highlight the following three works, which have heavily influenced my view on this subject:</p>
<ul>
<li>C. Bockelmann, et al., “Towards Massive Connectivity Support for Scalable mMTC Communications in 5G Networks,” IEEE Access, vol. 6, pp. 28 969–28 992, 2018.</li>
</ul>
<p>This paper offers a comprehensive review of recent research activities on the design of mMTC for 5G, performed in the context of the Horizon2020 European Project FANTASTIC-5G.</p>
<ul>
<li>Y. Polyanskiy, “A perspective on massive random access,” in IEEE Int. Symp. Inf. Theory (ISIT), Jan. 2017.</li>
</ul>
<p>This article offers a novel framework for the information-theoretic analysis of random-access protocols. One take-home message of this article is that the currently available random-access solutions are highly suboptimal in terms of energy efficiency.</p>
<ul>
<li>M. Berioli, G. Cocco, G. Liva, and A. Munari, Modern random access protocols, ser. Foundations and Trends in Networking. now Publishers, 2016.</li>
</ul>
<p>A large body of work on random access protocols has been developed over the last decade in the context of satellite communications. This book reviews the most promising strategies—a sure source of inspiration for the design of novel mMTC protocols.</p>
<hr />
<p><em>Question:</em></p>
<p>What are the most important open problems and future research directions towards mMTC?</p>
<p><em>Answer:</em></p>
<p>The work by Polyanskiy (2017) has illustrated that classic random-access protocols, which were originally developed to maximize throughput, perform poorly when analyzed through the lens of energy efficiency. Many researchers have recently developed alternative coding schemes, which approach the fundamental limits unveiled by Polyanskiy (2017). However, most of these analyses are performed on simplified channel models. Assessing whether these alternative solutions are robust against the impairments typically encountered in mMTCs (asynchronism, lack of channel knowledge, phase noise) is an important research direction.</p>
<p>Next-generation 5G networks will need to support other traffic types than just mMTC. How to guarantee optimal coexistence between drastically different service types is a crucial research question. As for the case of fading in a multiuser environment, one should be able to exploit this heterogeneity in the protocol-design phase to improve performance, a concept we referred to as “reliability diversity” in a recent <a href="https://arxiv.org/abs/1804.05057">contribution</a>.</p>
<p>A final important research issue is privacy and security. Short payloads combined with latency constraints, and limited computational capabilities at the devices make it hard to adapt current cryptographic methods to mMTC. Recent nonasymptotic information-theory results have given us a precise way to characterize the tradeoff between reliability, energy efficiency, and throughput in the transmission of short payloads. However, very little is available on the characterization of the latency and energy overheads required to implement security and privacy protocols.</p>
<p>The massiveness and the sporadic nature of the MTC traffic pose unique novel challenges that the research community has started investigating only very recently. I believe that there are many opportunities to bring novel research ideas into practice, both in 5G and in LP-WAN.</p>
<p>To conclude, I would like to thank the following persons for the many interesting discussions on the topic of mMTC: Martin Edofsson, Johan Lassing, <a href="http://www.wirelesscoding.org">Gianluigi Liva</a>, <a href="http://people.lids.mit.edu/yp/homepage/">Yury Polyanskiy</a>, <a href="http://petarpopovski.es.aau.dk">Petar Popovski</a>, <a href="https://nms.kcl.ac.uk/osvaldo.simeone/index.htm">Osvaldo Simeone</a>, Kasper Fløe Trillingsgaard, and Andreas Wolfgang.</p>Giuseppe Durisidurisi@chalmers.seInterview on massive machine-type communications for the Cognitive Networks Technical Committee NewsletterTutorial on short-packet transmission at Globecom 20182018-12-12T00:00:00+01:002018-12-12T00:00:00+01:00https://gdurisi.github.io/news-post/2018/12/12/tutorial-glob<p><img src="/images/glob-tutorial.png" alt="image-left" class="align-center" />
<a href="http://www.wirelesscoding.org">Dr. Gianluigi Liva</a> (German Aerospace Center–DLR), <a href="https://www.lnt.ei.tum.de/en/people/doctoral-researchers/steiner/">Fabian Steiner</a> (Technical University of Munich, Germany), and I gave a tutorial on <em>Short-packet transmission: fundamentals and practical coding schemes</em> at <a href="http://globecom2018.ieee-globecom.org">Globecom 2018</a>.</p>
<p>In the tutorial, we review the fundamental tradeoff between throughput and reliability when transmitting short packets, using recently-developed tools in finite-blocklength information theory. We also describe state-of-the-art code constructions (involving binary/nonbinary LDPC and turbo codes, polar codes, and tailbiting convolutional codes) for the short-block regime, and compare their performance with nonasymptotic information-theoretic limits.</p>
<p>The latest version of the slides of our tutorial can be downloaded <a href="/files/2018/Globecom_Tutorial_2018.pdf">here</a>.</p>Giuseppe Durisidurisi@chalmers.seSlides of our Globecom tutorial---short-packets communications: fundamentals and practical coding schemesTwo postdoctoral and one PhD positions available in my team2018-11-19T00:00:00+01:002018-11-19T00:00:00+01:00https://gdurisi.github.io/news-post/2018/11/19/vacancies<p><img src="/images/AvancezChalmers.png" alt="image-left" class="align-left" /></p>
<p><strong>Update March 2019</strong>: I have an additional opening for one PhD student and one <a href="http://www.chalmers.se/en/about-chalmers/Working-at-Chalmers/Vacancies/Pages/default.aspx?rmpage=job&rmjob=7369&rmlang=UK">postdoctoral researcher</a>. Do get in touch if you are interested.</p>
<hr />
<p>I am looking for two postdoctoral researchers and one PhD student who are interested in joining my research team.</p>
<p>The first postdoctoral vacancy is in the area of <strong>secure, private, and low-latency wireless cloud connectivity for Internet-of-things applications</strong>.</p>
<p>More information about this position is available <a href="http://www.chalmers.se/en/about-chalmers/Working-at-Chalmers/Vacancies/Pages/default.aspx?rmpage=job&rmjob=6928">here</a>.</p>
<p><strong>Application deadline</strong>: January 5, 2019.</p>
<p>The second postdoctoral vacancy is in the area of <strong>information-theoretic methods in machine learning</strong>, with specific focus on deep neural networks.</p>
<p>More information about this position is available <a href="http://www.chalmers.se/en/about-chalmers/Working-at-Chalmers/Vacancies/Pages/default.aspx?rmpage=job&rmjob=6918">here</a>.</p>
<p><strong>Application deadline</strong>: January 5, 2019.</p>
<p>The PhD student position is in the area of <strong>mission-critical machine-type wireless communications</strong>.</p>
<p>More information about this position is available <a href="http://www.chalmers.se/en/about-chalmers/Working-at-Chalmers/Vacancies/Pages/default.aspx?rmpage=job&rmjob=p6932">here</a>.</p>
<p><strong>Application deadline</strong>: January 13, 2019.</p>
<p>Do not hesitate to <a href="mailto:durisi@chalmers.se">get in touch</a> if you need additional information.</p>Giuseppe Durisidurisi@chalmers.seI am looking for 2 postdoctoral researchers and 1 PhD student who would like to join my teamVR project approved2018-11-17T00:00:00+01:002018-11-17T00:00:00+01:00https://gdurisi.github.io/news-post/2018/11/17/VR<p><img src="/images/vetenskapsradet-logotyp.png" alt="image-left" class="align-left" />
The Swedish Research Council has approved our project “<strong>Communication in time: theory and practice for the automated society</strong>”.
The project is led by my colleague Prof. <a href="https://www.chalmers.se/en/staff/Pages/erik-strom.aspx">Erik Ström</a> at the Communication System Group (Chalmers).
Prof. <a href="http://ceng.usc.edu/~ubli/ubli.html">Urbashi Mitra</a> at USC and I are co-PIs.</p>
<p>The goal of the project is to acquire knowledge about fundamental limits and design principles for wireless communication systems with guarantees on timely delivery of data. We will focus on the important and challenging case of sporadic transmissions of short packets, as this is urgently needed for emerging applications in automated driving, factory automation, health care, and other critical applications.</p>
<p>A PhD student will be recruited within the project. The vacancy will be announced shortly.</p>
<hr />Giuseppe Durisidurisi@chalmers.seA new research project in our team sponsored by VRWASP expedition project approved2018-11-17T00:00:00+01:002018-11-17T00:00:00+01:00https://gdurisi.github.io/news-post/2018/11/17/WASP-exp<p><img src="/images/logo_wasp_v2.png" alt="image-left" class="align-left" />
“Wallenberg AI, Autonomous Systems and Software Program” (<a href="http://wasp-sweden.org">WASP</a>), Sweden’s largest individual research program ever, is a program aimed at performing excellent research in artificial intelligence, autonomous systems, and software.
Our team has been granted a <strong>WASP expedition project</strong> entitled “<strong>Secure, Private, and Low-Latency Cloud Connectivity for IoT Applications</strong>”.</p>
<p>The WASP expedition project program is modeled on NSF’s Expeditions in Computing and targets high-gain, high-risk expeditions with a specific, challenging goal.</p>
<p>Our expedition project, which will be performed jointly by my team and by the team of Prof. <a href="http://www.cse.chalmers.se/~aikmitr/Group.html">Aikaterini Mitrokotsa</a> at the Computer Science Department (Chalmers), aims at designing secure, privacy-preserving, and low-latency wireless connectivity solutions for cloud computing. We will target a multi-client scenario where a large number of sporadically active, resource-limited devices communicate with a powerful but untrusted cloud server. The design will rely on tools from nonasymptotic information theory, statistics, signal processing, cryptography, and differential privacy.</p>
<p>Two postdoctoral researchers will be recruited within the project. The vacancies will be announced shortly.</p>Giuseppe Durisidurisi@chalmers.seA new research project in our team sponsored by WASPInvited presentation at PIMRC2018-09-12T00:00:00+02:002018-09-12T00:00:00+02:00https://gdurisi.github.io/news-post/2018/09/12/PIMRC<p><img src="/images/logo_wasp_v2.png" alt="image-left" class="align-left" />
This week, I attended <a href="http://pimrc2018.ieee-pimrc.org/">PIMRC</a> in Bologna, Italy, where I gave an invited presentation in the special session entitled “Small-data networks,” organized by my colleagues Enrico Paolini (University of Bologna, Italy),
Gianluigi Liva (DLR, Germany), and Andrea Munari (RWTH Aachen University, Germany).</p>
<p>The intriguing idea of the session was to reflect on the challenges brought to communication systems by the need to support machine-type communications and their unique traffic characteristics: massive number of sporadically active users, short-packet transmissions, and—for some applications—low latency combined with high reliability requirements.</p>
<p>The slides of my presentation, entitled “Fundamentals of short-packet transmission,” can be downloaded <a href="https://www.slideshare.net/GiuseppeDurisi/18-09pirmc">here</a>.</p>
<p>The goal of my presentation was to illustrate in simple terms how finite-blocklength information theory can help us with the design of communication systems that transmit short packets and are subject to sometimes stringent latency and reliability constraints. I believe that finite-blocklength information theory is an indispensable toolbox for physical and network layer researchers working with IoT applications.</p>
<hr />Giuseppe Durisidurisi@chalmers.seFinite-blocklength information theory is an indispensable toolbox for the design of short-packet communication systems. In an invited presentation at this year [IEEE International Symposium on Personal, Indoor and Mobile Radio Communications](http://pimrc2018.ieee-pimrc.org/), I explained what one can do with this toolbox. The slides of my presentation can be downloaded [here](https://www.slideshare.net/GiuseppeDurisi/18-09pirmc)To appear in Trans. Inf. Theory2018-08-18T00:00:00+02:002018-08-18T00:00:00+02:00https://gdurisi.github.io/news-post/2018/08/18/news_post_it<p>The following two papers will soon appear in the IEEE Transactions on Information Theory:</p>
<hr />
<p>K. F. Trillingsgaard, W. Yang, G. Durisi, and P. Popovski, “Common-message broadcast channels with feedback in the nonasymptotic regime: Stop feedback,” <i>IEEE Trans. Inf. Theory</i>, 2018, to appear. [<a href="http://arxiv.org/abs/1607.03519" target="_blank">arXiv</a>].</p>
<p><strong>Abstract:</strong>
We investigate the maximum coding rate for a given average blocklength and error probability over a $K$-user discrete memoryless broadcast channel for the scenario where a common message is transmitted using variable-length stop-feedback codes. For the point-to-point case, Polyanskiy <em>et al</em>. (2011) demonstrated that variable-length coding combined with stop-feedback significantly increases the speed of convergence of the maximum coding rate to capacity. This speed-up manifests itself in the absence of a square-root penalty in the asymptotic expansion of the maximum coding rate for large blocklengths, i.e., zero dispersion. In this paper, we present nonasymptotic achievability and converse bounds on the maximum coding rate of the common-message $K$-user discrete memoryless broadcast channel, which strengthen and generalize the ones reported in Trillingsgaard <em>et al.</em> (2015) for the two-user case. An asymptotic analysis of these bounds reveals that zero dispersion cannot be achieved for certain common-message broadcast channels (e.g., the binary symmetric broadcast channel). Furthermore, we identify conditions under which our converse and achievability bounds are tight up to the second order. Through numerical evaluations, we illustrate that our second-order expansions approximate accurately the maximum coding rate and that the speed of convergence to capacity is indeed slower than for the point-to-point case.</p>
<hr />
<p>K. F. Trillingsgaard, W. Yang, G. Durisi, and P. Popovski, “Common-message broadcast channels with feedback in the nonasymptotic regime: Full feedback,” <i>IEEE Trans. Inf. Theory</i>, 2018, to appear. [<a href="https://arxiv.org/abs/1706.07731" target="_blank">arXiv</a>].</p>
<p><strong>Abstract:</strong>
We investigate the maximum coding rate achievable on a two-user broadcast channel for the case where a common message is transmitted with feedback using either fixed-blocklength codes or variable-length codes. For the fixed-blocklength-code setup, we establish nonasymptotic converse and achievability bounds. An asymptotic analysis of these bounds reveals that feedback improves the second-order term compared to the no-feedback case. In particular, for a certain class of antisymmetric broadcast channels, we show that the dispersion is halved. For the variable-length-code setup, we demonstrate that the channel dispersion is zero.</p>
<hr />Giuseppe Durisidurisi@chalmers.seThe following two papers will soon appear in the IEEE Transactions on Information Theory:Our team has a new member2018-08-18T00:00:00+02:002018-08-18T00:00:00+02:00https://gdurisi.github.io/news-post/2018/08/18/team<p><a href="https://www.chalmers.se/en/staff/Pages/Fredrik-Hellstrom.aspx">Fredrik Hellström</a> has joined our team as a Ph.D. student. The goal of his project is to shed light on the remarkable generalization performance of deep neural networks. This will be done using mathematical tools from various fields including optimization and information theory. I’ll be involved in the project as supervisor, together with my colleague <a href="http://www.maths.lth.se/matematiklth/personal/fredrik/index.html">Fredrik Kahl</a>, head of the computer vision group at Chalmers, who will act as co-supervisor.</p>
<p>Fredrik Hellström holds a master’s degree in Physics from Gothenburg University. In his master’s thesis, he explored dark matter models and their possible interaction with standard particles. Welcome, Fredrik!</p>
<hr />Giuseppe Durisidurisi@chalmers.seFredrik Hellström has joined our team as new Ph.D. studentDesigning URLLC via nonasymptotic information theory–part 42018-07-10T00:00:00+02:002018-07-10T00:00:00+02:00https://gdurisi.github.io/posts/FBL-tutorial-4<p>In <a href="/posts/fbl-tutorial-3/">part 3</a> we learnt how to evaluate the achievability bound, i.e., the blue curve in the figure below.</p>
<p><img src="/images/biawgn_fig_possible.png" alt="Feasible and not feasible (k,n) pairs" /></p>
<p>This post is about computing the red curve, i.e., the converse bound.</p>
<p>Converse bounds in information theory express an impossibility condition. In our case, a converse result says that no $(k,n,\epsilon)$-code can be found whose triplet $(k,n,\epsilon)$ violates a certain relation. Converse bounds give an upper bound on $k^\star(n,\epsilon)$ and hence on $R^\star(n,\epsilon)$. Equivalently, they give a lower bound on $\epsilon^\star(k,n)$.</p>
<p>I will review in this post a very general technique to obtain converse bounds in finite-blocklength information theory, which relies on the so-called <em>metaconverse theorem</em> <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>.</p>
<p>Before introducing the metaconverse theorem, we need to review some basic concepts in binary hypothesis testing.</p>
<p>Binary hypothesis testing deals with the following problem (see figure below). We observe a vector $X^n$ which may have been generated according to a distribution $P_{X^n}$ or according to an alternative distribution $Q_{X^n}$. Our task is to discover which of the two distributions generated $X^n$.</p>
<p>To do so, we design a test $Z$ whose input is the vector $X^n$ and whose output is a binary value in $\{0,1\}$. We use the convention that $0$ indicates that the test chooses $P_{X^n}$ and $1$ indicates that the test chooses $Q_{X^n}$.</p>
<p><img src="/images/hypothesis-testing.jp2" alt="Binary hypothesis testing" /></p>
<p>We can think of the test $Z$ as a binary partition of the set of the vectors $X^n$.</p>
<p><img src="/images/neyman_pearson.png" alt="Binary hypothesis testing as partition" /></p>
<p>Actually, we will need the more general notion of <em>randomized test</em>, in which the test is a conditional distribution $P_{Z|X^n}$. Coarsely speaking, for some values of $X^n$, we may flip a coin to decide whether $Z=0$ or $Z=1$.</p>
<p>We are interested in the test that minimizes the probability of misclassification when $X^n$ is generated according to $Q_{X^n}$, given that the probability of correct classification under $P_{X^n}$ is above a given target. Mathematically, we define the probability of misclassification under $Q_{X^n}$ as</p>
<script type="math/tex; mode=display">Q_{X^n}[Z=0]=\sum_{x^n}Q_{X^n}(x^n)P_{Z | X^n}(0 | x^n).</script>
<p>To parse this expression, recall that $Z=0$ means that the test chooses $P_{X^n}$, which is the wrong distribution in this case.
Here, I assumed for simplicity that $X^n$ is a discrete random variable. For the continuous case, one needs to replace the sum with an integral. The probability of correct classification under $P_{X^n}$ is similarly given by</p>
<script type="math/tex; mode=display">P_{X^n}[Z=0]=\sum_{x^n}P_{X^n}(x^n)P_{Z | X^n}(0 | x^n).</script>
<p>We let $\beta_{\alpha}(P_{X^n},Q_{X^n})$ be the Neyman-Pearson beta function, which denotes the smallest possible misclassification probability under $Q_{X^n}$ when the probability of correct classification under $P_{X^n}$ is at least $\alpha$:</p>
<script type="math/tex; mode=display">\beta_{\alpha}(P_{X^n},Q_{X^n})=\inf_{P_{Z| X^n}: P_{X^n}[Z=0]\geq \alpha} Q_{X^n}[Z=0].</script>
<p>It follows from the Neyman-Pearson lemma (see e.g., p. 144 of the following <a href="http://people.lids.mit.edu/yp/homepage/data/itlectures_v5.pdf">lecture notes</a>) that $\beta_{\alpha}(P_{X^n},Q_{X^n})$ is achieved by the test that computes the log-likelihood ratio $\log \frac{P_{X^n}(x^n)}{Q_{X^n}(x^n)}$ and sets $Z=0$ if the log-likelihood ratio is above a threshold $\gamma$; it sets $Z=1$ otherwise. Here, $\gamma$ is chosen so that</p>
<script type="math/tex; mode=display">P_{X^n}\left[\log \frac{P_{X^n}(X^n)}{Q_{X^n}(X^n)} \geq \gamma \right]=\alpha.</script>
<p>Note that to satisfy this equality, one needs in general to use a randomized test that flips a suitably biased coin whenever $\log \frac{P_{X^n}(x^n)}{Q_{X^n}(x^n)} = \gamma$.</p>
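<p>To make the definition concrete, here is a minimal Python sketch that evaluates $\beta_{\alpha}(P,Q)$ for two distributions on a small finite alphabet. The function name <code>np_beta</code> and the example distributions are illustrative assumptions, not part of any library. The sketch sorts the outcomes by decreasing likelihood ratio, adds them to the region where $Z=0$ until the constraint on the probability of correct classification under $P$ is met, and randomizes on the boundary outcome.</p>

```python
import math


def np_beta(alpha, p, q):
    """Smallest Q[Z=0] over randomized tests with P[Z=0] >= alpha.

    p and q are lists of probabilities over the same finite alphabet.
    """
    def ratio(i):
        # Likelihood ratio p/q, taken as infinite when q assigns zero mass.
        return math.inf if q[i] == 0 else p[i] / q[i]

    # Outcomes with p[i] == 0 only add Q-mass, so the optimal test never accepts them.
    order = sorted((i for i in range(len(p)) if p[i] > 0), key=ratio, reverse=True)

    p_mass, q_mass = 0.0, 0.0
    for i in order:
        if p_mass + p[i] >= alpha:
            # Boundary outcome: accept it with probability lam so that P[Z=0] = alpha.
            lam = (alpha - p_mass) / p[i]
            return q_mass + lam * q[i]
        p_mass += p[i]
        q_mass += q[i]
    return q_mass


# Illustrative distributions (assumed for this example only).
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
print(np_beta(0.9, p, q))
```

<p>In this toy example with $\alpha=0.9$, the test accepts the first two outcomes with probability one and the third with probability $1/2$, so that $\beta_{0.9}(P,Q)=0.2+0.3+0.5\cdot 0.5=0.75$, up to floating-point rounding.</p>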
<p>One may wonder at this point what binary hypothesis testing has to do with channel coding. Surely, decoding involves a hypothesis test. The test, however, is not binary but $M$-ary, since it involves $M=2^k$ codewords. We will see later on in this post why such a binary hypothesis test emerges naturally. I provide for the moment the following intuition.</p>
<p>Consider the binary hypothesis testing problem in which one has to decide whether the pair $(X^n, Y^n)$ comes from $P_{X^n,Y^n}$ or from $P_{X^n}P_{Y^n}$. Assume that all distributions are product distributions, i.e.,</p>
<script type="math/tex; mode=display">P_{X^n,Y^n}(x^n,y^n)=\prod_{j=1}^n P_{X,Y}(x_j,y_j).</script>
<p>Then Stein’s lemma (see page 147 of <sup id="fnref:1:1"><a href="#fn:1" class="footnote">1</a></sup>) implies that, for all $\alpha \in (0,1)$,</p>
<script type="math/tex; mode=display">\beta_{\alpha}(P_{X^n,Y^n},P_{X^n}P_{Y^n}) = e^{-n\bigl[I(X;Y) + o(1)\bigr]},\quad n\to\infty.</script>
<p>Here, $o(1)$ stands for terms that go to zero when $n\rightarrow \infty$.</p>
<p>In words, $\beta_{\alpha}(P_{X^n,Y^n},P_{X^n}P_{Y^n})$ decays exponentially fast in $n$, with an exponent given by the mutual information $I(X;Y)$. So if mutual information is useful in characterizing the performance of a communication system in the asymptotic limit $n\to\infty$, it makes sense that the Neyman-Pearson $\beta$ function is useful to describe performance at finite blocklength.</p>
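<p>The exponential decay can be checked numerically. The sketch below is a simplified illustration under assumptions of my choosing: instead of the pair $P_{X^n,Y^n}$ versus $P_{X^n}P_{Y^n}$, it tests two i.i.d. Bernoulli distributions with parameters $p$ and $q$, for which Stein’s lemma gives the relative entropy $D(P\|Q)$ as exponent (the mutual information $I(X;Y)$ is the special case in which $P$ is the joint distribution and $Q$ the product of its marginals). The function names and the parameter values are hypothetical.</p>

```python
import math


def log_binom_pmf(n, t, p):
    """Natural log of the Binomial(n, p) probability of observing t ones."""
    return (math.lgamma(n + 1) - math.lgamma(t + 1) - math.lgamma(n - t + 1)
            + t * math.log(p) + (n - t) * math.log(1 - p))


def stein_exponent(n, p, q, alpha):
    """Return -(1/n) ln beta_alpha(P^n, Q^n) for P = Bern(p), Q = Bern(q), p > q."""
    # With p > q the likelihood ratio grows with the number of ones t,
    # so the optimal test accepts P for the largest values of t.
    p_mass, q_mass = 0.0, 0.0
    for t in range(n, -1, -1):
        pt = math.exp(log_binom_pmf(n, t, p))
        qt = math.exp(log_binom_pmf(n, t, q))
        if p_mass + pt >= alpha:
            lam = (alpha - p_mass) / pt  # randomize on the boundary value of t
            return -math.log(q_mass + lam * qt) / n
        p_mass += pt
        q_mass += qt
    return -math.log(q_mass) / n


# Illustrative parameters (assumed for this example only).
p, q, alpha = 0.5, 0.1, 0.9
relative_entropy = p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))
for n in (100, 1000):
    print(n, stein_exponent(n, p, q, alpha), relative_entropy)
```

<p>For these parameters, the computed exponent moves closer to $D(P\|Q)$ as $n$ grows, but the gap closes slowly, which is one more reason to work with $\beta_{\alpha}$ directly at finite blocklength.</p>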
<p>We are now ready to state our converse bound, which we shall refer to as the metaconverse theorem, although this term is used in <sup id="fnref:1:2"><a href="#fn:1" class="footnote">1</a></sup> for a slightly more general result. As for the RCUs bound in the <a href="/posts/fbl-tutorial-3/">previous post</a>, we will first state the bound for a general channel $P_{Y^n|X^n}$. Then, we will prove it. Finally, we will particularize it to the bi-AWGN channel. Here is the converse bound:</p>
<p class="notice--success">Fix an arbitrary $Q_{Y^n}$. Every $(k,n,\epsilon)$–code must satisfy</p>
<script type="math/tex; mode=display">2^k\leq \sup_{P_{X^n}}\frac{1}{\beta_{1-\epsilon}(P_{X^n}P_{Y^n|X^n},P_{X^n}Q_{Y^n})}.\label{eq:converse}</script>
<p>Here is how to prove this theorem. Fix a $(k,n,\epsilon)$–code and let $P_{X^n}$ be the distribution induced by the encoder on the transmitted symbol vector $X^n$.
We now use our code to perform a binary-hypothesis test between $P_{X^n}P_{Y^n |X^n}$ and $P_{X^n}Q_{Y^n}$. Specifically, for a given pair $(X^n,Y^n)$ we determine the corresponding transmitted message $W$ by inverting the encoder given its output $X^n$. We also find the decoded message $\hat{W}$ by applying the decoder to $Y^n$.
If $W=\hat{W}$, we set $Z=0$ and choose $P_{X^n}P_{Y^n |X^n}$. Otherwise, we set $Z=1$ and choose $P_{X^n}Q_{Y^n}$.</p>
<p>Since the given code has error probability no larger than $\epsilon$ when used on the channel $P_{Y^n |X^n}$, we conclude that</p>
<script type="math/tex; mode=display">P_{X^n}P_{Y^n |X^n}[Z=0]\geq 1-\epsilon.</script>
<p>The misclassification probability under the alternative hypothesis $P_{X^n}Q_{Y^n}$ is the probability that the decoder produces the correct message when the channel output $Y^n\sim Q_{Y^n}$ is independent of the channel input $X^n$. This probability, which is equal to $1/2^k$, is no smaller than the misclassification probability $\beta_{1-\epsilon}(P_{X^n}P_{Y^n|X^n},P_{X^n}Q_{Y^n})$ of the optimal test. Mathematically,</p>
<script type="math/tex; mode=display">\beta_{1-\epsilon}(P_{X^n}P_{Y^n|X^n},P_{X^n}Q_{Y^n})\leq P_{X^n}Q_{Y^n}[Z=0]=\frac{1}{2^k}.</script>
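<p>To see why the last equality holds, assume equiprobable messages and a deterministic decoder, and let $\mathcal{D}_w$ denote the set of channel outputs that the decoder maps to the message $w$ (this notation is introduced here only for illustration). The sets $\mathcal{D}_1,\dots,\mathcal{D}_{2^k}$ partition the output space. Hence, when $Y^n\sim Q_{Y^n}$ is independent of the transmitted codeword,</p>
<script type="math/tex; mode=display">P_{X^n}Q_{Y^n}[Z=0]=\sum_{w=1}^{2^k}\frac{1}{2^k}\,Q_{Y^n}[\mathcal{D}_w]=\frac{1}{2^k}\sum_{w=1}^{2^k}Q_{Y^n}[\mathcal{D}_w]=\frac{1}{2^k}.</script>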
<p>Hence, for the chosen $(k,n,\epsilon)$–code, we have</p>
<script type="math/tex; mode=display">2^k\leq \frac{1}{\beta_{1-\epsilon}(P_{X^n}P_{Y^n|X^n},P_{X^n}Q_{Y^n})}.\label{eq:metaconverse_given_code}</script>
<p>We finally maximize over $P_{X^n}$ to obtain a bound that does not depend on the chosen code and, hence, holds for all $(k,n,\epsilon)$–codes. This concludes the proof.</p>
<p>How tight is this bound? It turns out that, for a given code with maximum-likelihood (ML) decoder, the bound given in \eqref{eq:metaconverse_given_code} is tight, i.e., it holds with equality, if one chooses the auxiliary distribution $Q_{Y^n}$ suitably <sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>.</p>
<p>Specifically, set $Q^\star_{Y^n}(y^n)=\frac{1}{\mu}\max_{\bar{x}^n}P_{Y^n|X^n}(y^n|\bar{x}^n)$, where the maximum is over the codewords of the given code and $\mu$ is a normalization constant.
Note now that, for a given transmitted codeword $x^n$, the ML decoder finds the correct codeword (under the simplifying assumption of no ties) if</p>
<script type="math/tex; mode=display">P_{Y^n|X^n}(y^n|x^n)\geq \max_{\bar{x}^n}P_{Y^n|X^n}(y^n|\bar{x}^n).</script>
<p>This is the same as requiring that</p>
<script type="math/tex; mode=display">\log\frac{P_{Y^n|X^n}(y^n|x^n)}{Q^\star_{Y^n}(y^n)}\geq \log \mu.</script>
<p>Furthermore, observe that, since the code has probability of error $\epsilon$ under ML decoding, the following must hold:</p>
<script type="math/tex; mode=display">P_{X^n}P_{Y^n|X^n}\left[\log\frac{P_{Y^n|X^n}(y^n|x^n)}{Q^\star_{Y^n}(y^n)}\geq \log \mu\right]=1-\epsilon.</script>
<p>Hence, since a likelihood-ratio test of this form is optimal by the Neyman-Pearson lemma,</p>
<script type="math/tex; mode=display">\beta_{1-\epsilon}(P_{X^n}P_{Y^n|X^n},P_{X^n}Q^\star_{Y^n})=P_{X^n}Q^\star_{Y^n}\left[\log\frac{P_{Y^n|X^n}(y^n|x^n)}{Q^\star_{Y^n}(y^n)}\geq \log \mu\right]=\frac{1}{2^k}.</script>
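<p>The two probabilities above can be checked numerically on a toy example. The following MATLAB sketch uses a length-3 repetition code over a binary symmetric channel, which is an assumption made only for illustration (it is not the bi-AWGN channel considered in this post), and for which no ties occur. All variable names are hypothetical.</p>
<div class="language-matlab highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% Toy check: length-3 repetition code over a BSC (assumed example).
p = 0.1;                                % crossover probability (illustrative)
n = 3;                                  % blocklength
codebook = [0 0 0; 1 1 1];              % 2^k codewords with k = 1
M = size(codebook,1);
Y = dec2bin(0:2^n-1, n) - '0';          % all 2^n channel outputs, one per row
W = zeros(M, 2^n);                      % W(m,j) = P(y_j | x_m)
for m = 1:M
    d = sum(Y ~= repmat(codebook(m,:), 2^n, 1), 2)';  % Hamming distances
    W(m,:) = p.^d .* (1-p).^(n-d);
end
mx    = max(W, [], 1);                  % maximum of P(y|x) over the codewords
mu    = sum(mx);                        % normalization constant
Qstar = mx/mu;                          % auxiliary output distribution
correct = (W &gt;= repmat(mx, M, 1));      % pairs (x,y) for which ML decoding succeeds
pc  = sum(sum(W .* correct))/M;                    % probability under P_X P_{Y|X}
pfa = sum(sum(repmat(Qstar, M, 1) .* correct))/M;  % probability under P_X Q*
fprintf('1-epsilon = %.4f, P_X Q*[accept] = %.4f, 1/2^k = %.4f\n', pc, pfa, 1/M);
</code></pre></div></div>
<p>The last two printed values coincide, in agreement with the equality above. Note that this sketch only evaluates the test induced by the ML decoder; it does not verify numerically that this test is optimal.</p>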
<h3 id="evaluation-of-the-metaconverse-bound-for-the-binary-awgn-channel">Evaluation of the metaconverse bound for the binary-AWGN channel</h3>
<p>Computing the metaconverse bound seems very hard because of the maximization over $P_{X^n}$ in \eqref{eq:converse}. It turns out that this maximization can be avoided if one chooses the auxiliary output distribution $Q_{Y^n}$ suitably.</p>
<p>Specifically, let <script type="math/tex">P_Y= \frac{1}{2}\mathcal{N}(-\sqrt{\rho},1)+\frac{1}{2}\mathcal{N}(\sqrt{\rho},1)</script> be the capacity-achieving output distribution (see <a href="/posts/fbl-tutorial-2/">part 2</a>).</p>
<p>We choose $Q_{Y^n}$ as the product distribution $P^n_Y$. The key observation now is that, because of symmetry, for every BPSK channel input $x^n$,</p>
<script type="math/tex; mode=display">\beta_{1-\epsilon}(P_{Y^n | X^n=x^n}, Q_{Y^n} )= \beta_{1-\epsilon}(P_{Y^n | X^n=\bar{x}^n}, Q_{Y^n} ) \label{eq:symmetry}</script>
<p>where $\bar{x}^n$ is the all-one vector. One can then show (see Lemma 29 in <sup id="fnref:1:3"><a href="#fn:1" class="footnote">1</a></sup>) that, when \eqref{eq:symmetry} holds,</p>
<script type="math/tex; mode=display">\beta_{1-\epsilon}(P_{X^n}P_{Y^n|X^n},P_{X^n}Q_{Y^n})= \beta_{1-\epsilon}(P_{Y^n | X^n=\bar{x}^n}, Q_{Y^n} ).</script>
<p>Note that the right-hand side of this equality does not depend on $P_{X^n}$.
Hence, for this (possibly suboptimal) choice of $Q_{Y^n}$, the metaconverse bound simplifies to</p>
<script type="math/tex; mode=display">2^k\leq \frac{1}{
\beta_{1-\epsilon}(P_{Y^n | X^n=\bar{x}^n}, Q_{Y^n} )}.</script>
<p>Evaluating the $\beta$ function directly through the Neyman-Pearson lemma is challenging, because the corresponding tail probabilities are not known in closed form, and one of them needs to be evaluated with very high accuracy (we encountered the same problem when we discussed in <a href="/posts/fbl-tutorial-3/">part 3</a> the evaluation of the RCU bound).</p>
<p>A simpler approach is to further relax the bound using the following inequality on the $\beta$ function (see p. 143 of the following <a href="http://people.lids.mit.edu/yp/homepage/data/itlectures_v5.pdf">lecture notes</a>):</p>
<script type="math/tex; mode=display">\beta_{1-\epsilon}(P_{Y^n| X^n=\bar{x}^n},Q_{Y^n}) \geq
\sup_{\gamma > 0} \left\{e^{-\gamma}\left(1-\epsilon- P_{Y^n | X^n=\bar{x}^n}\left[\log \frac{P_{Y^n| X^n}(Y^n|\bar{x}^n)}{Q_{Y^n}(Y^n)}\geq \gamma\right]\right)\right\}.</script>
<p>The resulting bound, which is a generalized version of the Verdú-Han converse bound <sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup>, can be evaluated efficiently.</p>
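<p>To make the bound explicit, let $\imath(\bar{x}^n;Y^n)=\log \frac{P_{Y^n| X^n}(Y^n|\bar{x}^n)}{Q_{Y^n}(Y^n)}$ denote the information density (this notation is introduced here for convenience). The term in parentheses in the relaxed bound is equal to the probability that the information density is smaller than $\gamma$, minus $\epsilon$. Substituting the relaxed bound into the metaconverse bound and taking logarithms, we obtain</p>
<script type="math/tex; mode=display">k\log 2\leq \inf_{\gamma>0}\left\{\gamma-\log\left(P_{Y^n | X^n=\bar{x}^n}\left[\imath(\bar{x}^n;Y^n)<\gamma\right]-\epsilon\right)\right\},</script>
<p>where the term in braces is set to $+\infty$ whenever the probability does not exceed $\epsilon$. For the bi-AWGN channel with $Q_{Y^n}=P_Y^n$ and the all-one input $\bar{x}^n$, a direct computation gives</p>
<script type="math/tex; mode=display">\imath(\bar{x}^n;Y^n)=\sum_{j=1}^n\left[\log 2-\log\left(1+e^{-2\sqrt{\rho}\,Y_j}\right)\right],\qquad \sqrt{\rho}\,Y_j\sim\mathcal{N}(\rho,\rho),</script>
<p>so that the probability can be estimated by drawing independent Gaussian samples.</p>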
<p>Here is a simple MATLAB implementation.</p>
<div class="language-matlab highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span> <span class="n">rate</span><span class="o">=</span><span class="n">vh_metaconverse_rate</span><span class="p">(</span><span class="n">snr</span><span class="p">,</span><span class="n">n</span><span class="p">,</span><span class="n">epsilon</span><span class="p">,</span><span class="n">precision</span><span class="p">)</span>
<span class="c1">% Monte Carlo evaluation of the generalized Verdú-Han converse (bi-AWGN).</span>
<span class="c1">% snr: linear SNR; n: blocklength; epsilon: target error probability;</span>
<span class="c1">% precision: number of Monte Carlo samples. Output in bits per channel use.</span>
<span class="n">idvec</span><span class="o">=</span><span class="n">compute_samples</span><span class="p">(</span><span class="n">snr</span><span class="p">,</span><span class="n">n</span><span class="p">,</span><span class="n">precision</span><span class="p">);</span>
<span class="n">idvec</span><span class="o">=</span><span class="nb">sort</span><span class="p">(</span><span class="n">idvec</span><span class="p">);</span>
<span class="n">ratevec</span><span class="o">=</span><span class="nb">zeros</span><span class="p">(</span><span class="nb">length</span><span class="p">(</span><span class="n">idvec</span><span class="p">),</span><span class="mi">1</span><span class="p">);</span>
<span class="k">for</span> <span class="nb">i</span><span class="o">=</span><span class="mi">1</span><span class="p">:</span><span class="mi">1</span><span class="p">:</span><span class="n">precision</span>
    <span class="c1">% empirical probability that the information density is below gamma=idvec(i)</span>
    <span class="n">prob</span><span class="o">=</span><span class="nb">sum</span><span class="p">(</span><span class="n">idvec</span><span class="o">&lt;</span><span class="n">idvec</span><span class="p">(</span><span class="nb">i</span><span class="p">))/</span><span class="n">precision</span><span class="p">;</span>
    <span class="c1">% gamma-log(prob-epsilon); this is +Inf if prob does not exceed epsilon</span>
    <span class="n">ratevec</span><span class="p">(</span><span class="nb">i</span><span class="p">)</span><span class="o">=</span><span class="n">idvec</span><span class="p">(</span><span class="nb">i</span><span class="p">)</span><span class="o">-</span><span class="nb">log</span><span class="p">(</span><span class="nb">max</span><span class="p">(</span><span class="n">prob</span><span class="o">-</span><span class="n">epsilon</span><span class="p">,</span><span class="mi">0</span><span class="p">));</span>
<span class="k">end</span>
<span class="n">rate</span><span class="o">=</span><span class="nb">min</span><span class="p">(</span><span class="n">ratevec</span><span class="p">)/(</span><span class="n">n</span><span class="o">*</span><span class="nb">log</span><span class="p">(</span><span class="mi">2</span><span class="p">));</span>
<span class="k">end</span>
<span class="k">function</span> <span class="n">id</span><span class="o">=</span><span class="n">compute_samples</span><span class="p">(</span><span class="n">snr</span><span class="p">,</span><span class="n">n</span><span class="p">,</span><span class="n">precision</span><span class="p">)</span>
<span class="c1">% samples of the information density (in nats) for the all-one channel input</span>
<span class="n">normrv</span><span class="o">=</span><span class="nb">randn</span><span class="p">(</span><span class="n">n</span><span class="p">,</span><span class="n">precision</span><span class="p">)</span><span class="o">*</span><span class="nb">sqrt</span><span class="p">(</span><span class="n">snr</span><span class="p">)</span><span class="o">+</span><span class="n">snr</span><span class="p">;</span>
<span class="n">idvec</span><span class="o">=</span><span class="nb">log</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span><span class="o">-</span><span class="nb">log</span><span class="p">(</span><span class="mi">1</span><span class="o">+</span><span class="nb">exp</span><span class="p">(</span><span class="o">-</span><span class="mf">2.</span><span class="o">*</span><span class="n">normrv</span><span class="p">));</span>
<span class="n">id</span><span class="o">=</span><span class="nb">sum</span><span class="p">(</span><span class="n">idvec</span><span class="p">,</span><span class="mi">1</span><span class="p">);</span> <span class="c1">% sum over the n channel uses (also correct when n=1)</span>
<span class="k">end</span>
</code></pre></div></div>
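<p>A possible way to call this function is sketched below. The parameter values are arbitrary choices made for illustration; note that the function expects the SNR in linear scale.</p>
<div class="language-matlab highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% Illustrative call of vh_metaconverse_rate; all parameter values are assumptions.
snr_dB    = 0;                 % SNR in dB
snr       = 10^(snr_dB/10);    % conversion to linear scale
n         = 128;               % blocklength
epsilon   = 1e-3;              % target error probability
precision = 1e4;               % number of Monte Carlo samples
rate = vh_metaconverse_rate(snr, n, epsilon, precision);
fprintf('Converse: R &lt;= %.3f bits per channel use (k &lt;= %.1f bits)\n', rate, n*rate);
</code></pre></div></div>
<p>Since the probability in the bound is estimated by Monte Carlo simulation, the output is itself an estimate, and <code>precision</code> should be much larger than $1/\epsilon$ for the estimate to be reliable. Keep in mind that the running time of the loop grows quadratically with <code>precision</code>.</p>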
<p>In the next blog post (after the summer break), I will explain how to find easy-to-compute approximations to converse and achievability bounds.</p>
<hr />
<p>Go back to <a href="/posts/fbl-tutorial-3/">part 3</a></p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Y. Polyanskiy, H. V. Poor, and S. Verdú, “Channel coding rate in the finite blocklength regime,” IEEE Trans. Inf. Theory, vol. 56, no. 5, pp. 2307–2359, May 2010. <a href="#fnref:1" class="reversefootnote">↩</a> <a href="#fnref:1:1" class="reversefootnote">↩<sup>2</sup></a> <a href="#fnref:1:2" class="reversefootnote">↩<sup>3</sup></a> <a href="#fnref:1:3" class="reversefootnote">↩<sup>4</sup></a></p>
</li>
<li id="fn:2">
<p>G. Vazquez-Vilar, A. T. Campo, A. Guillén i Fàbregas, and A. Martinez, “Bayesian M-ary hypothesis testing: The meta-converse and Verdú-Han bounds are tight,” IEEE Trans. Inf. Theory, vol. 62, no. 5, pp. 2324–2333, May 2016. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>S. Verdú and T. S. Han, “A general formula for channel capacity,” IEEE Trans. Inf. Theory, vol. 40, no. 4, pp. 1147–1157, Jul. 1994. <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Giuseppe Durisidurisi@chalmers.seFourth part of my series of posts about the design of URLLC through finite-blocklength information theory: the metaconverse theorem.