<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: The unicorn problem</title>
	<atom:link href="http://www.statisticsblog.com/2012/10/the-unicorn-problem/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.statisticsblog.com/2012/10/the-unicorn-problem/</link>
	<description>In Monte Carlo We Trust</description>
	<lastBuildDate>Thu, 02 May 2013 18:14:27 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
	<item>
		<title>By: Matt Asher</title>
		<link>http://www.statisticsblog.com/2012/10/the-unicorn-problem/comment-page-1/#comment-16878</link>
		<dc:creator>Matt Asher</dc:creator>
		<pubDate>Tue, 12 Feb 2013 16:40:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.statisticsblog.com/?p=695#comment-16878</guid>
		<description>Hi Pascal,

Nice! Reminds me of collecting cards for McDonald&#039;s Monopoly game as a kid, trying to complete a particular set. We always suspected that they made the last card in a set very rare, to prevent people from winning many of the big prizes. Given that we were just a tiny fraction of the people who collected the cards, we were effectively sampling with replacement from the overall collection.

Makes me wonder about trading, though, and how much it would increase your chances of finding your &quot;unicorn&quot;. Could make the simulation more interesting.</description>
		<content:encoded><![CDATA[<p>Hi Pascal,</p>
<p>Nice! Reminds me of collecting cards for McDonald&#8217;s Monopoly game as a kid, trying to complete a particular set. We always suspected that they made the last card in a set very rare, to prevent people from winning many of the big prizes. Given that we were just a tiny fraction of the people who collected the cards, we were effectively sampling with replacement from the overall collection.</p>
<p>Makes me wonder about trading, though, and how much it would increase your chances of finding your &#8220;unicorn&#8221;. Could make the simulation more interesting.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Pascal</title>
		<link>http://www.statisticsblog.com/2012/10/the-unicorn-problem/comment-page-1/#comment-16877</link>
		<dc:creator>Pascal</dc:creator>
		<pubDate>Tue, 12 Feb 2013 15:53:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.statisticsblog.com/?p=695#comment-16877</guid>
		<description>Hi, I stumbled upon your nice blog. The &quot;unicorn problem&quot; is related to the classic coupon collector&#039;s problem (http://en.wikipedia.org/wiki/Coupon_collector%27s_problem), where you have n urns and at each step you place a ball into a randomly selected urn. The question is then how many balls you need before every urn contains at least one ball. There are many generalizations of this problem; a Google search will help you find them.

The relation to your problem: instead of saying that each species is found with probability 1/20 on each trip, it is roughly equivalent to say that each trip finds one (random) species, and then to count 50 such trips as one, since 1000 species at 1/20 each gives about 50 finds per trip (this approximation is OK, because 50 is not much bigger than the square root of 1000). The expected number of trips necessary to find all 1000 species should therefore be roughly 1000 * log(1000) / 50, with log the natural logarithm, which is approximately 140.

Just checking against the graph: seems to fit :)

You can also do the calculations directly for your model. This is actually very easy, since the random variables N_1,...,N_k,...,N_1000, where N_k is the number of trips you have to make to find species k, are independent. Each one is geometric with success probability 1/20, so their maximum is roughly 20*log(1000), again about 140 (a standard result from extreme value theory).</description>
		<content:encoded><![CDATA[<p>Hi, I stumbled upon your nice blog. The &#8220;unicorn problem&#8221; is related to the classic coupon collector&#8217;s problem (<a href="http://en.wikipedia.org/wiki/Coupon_collector%27s_problem" rel="nofollow">http://en.wikipedia.org/wiki/Coupon_collector%27s_problem</a>), where you have n urns and at each step you place a ball into a randomly selected urn. The question is then how many balls you need before every urn contains at least one ball. There are many generalizations of this problem; a Google search will help you find them.</p>
<p>The relation to your problem: instead of saying that each species is found with probability 1/20 on each trip, it is roughly equivalent to say that each trip finds one (random) species, and then to count 50 such trips as one, since 1000 species at 1/20 each gives about 50 finds per trip (this approximation is OK, because 50 is not much bigger than the square root of 1000). The expected number of trips necessary to find all 1000 species should therefore be roughly 1000 * log(1000) / 50, with log the natural logarithm, which is approximately 140.</p>
<p>Just checking against the graph: seems to fit <img src='http://www.statisticsblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>You can also do the calculations directly for your model. This is actually very easy, since the random variables N_1,&#8230;,N_k,&#8230;,N_1000, where N_k is the number of trips you have to make to find species k, are independent. Each one is geometric with success probability 1/20, so their maximum is roughly 20*log(1000), again about 140 (a standard result from extreme value theory).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Landmine detection revisited; the inverse unicorn problem &#171; Probability and statistics blog</title>
		<link>http://www.statisticsblog.com/2012/10/the-unicorn-problem/comment-page-1/#comment-16835</link>
		<dc:creator>Landmine detection revisited; the inverse unicorn problem &#171; Probability and statistics blog</dc:creator>
		<pubDate>Mon, 04 Feb 2013 15:28:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.statisticsblog.com/?p=695#comment-16835</guid>
		<description>[...] the existence of mines if they were very sparse in an area? In a sense, this is the inverse of the unicorn problem; instead of trying to find every last mine, we&#8217;re concerned with finding the very first one, [...]</description>
		<content:encoded><![CDATA[<p>[...] the existence of mines if they were very sparse in an area? In a sense, this is the inverse of the unicorn problem; instead of trying to find every last mine, we&#8217;re concerned with finding the very first one, [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Simulation of landmine clearing with Massoud Hassani&#8217;s Mine Kafon &#171; Probability and statistics blog</title>
		<link>http://www.statisticsblog.com/2012/10/the-unicorn-problem/comment-page-1/#comment-16710</link>
		<dc:creator>Simulation of landmine clearing with Massoud Hassani&#8217;s Mine Kafon &#171; Probability and statistics blog</dc:creator>
		<pubDate>Fri, 11 Jan 2013 00:29:30 +0000</pubDate>
		<guid isPermaLink="false">http://www.statisticsblog.com/?p=695#comment-16710</guid>
		<description>[...] you look on my blog you&#8217;ll see a post I did about something I call The Unicorn Problem, related to finding all of the new species in an environment. The problem there, as with this [...]</description>
		<content:encoded><![CDATA[<p>[...] you look on my blog you&#8217;ll see a post I did about something I call The Unicorn Problem, related to finding all of the new species in an environment. The problem there, as with this [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Asher</title>
		<link>http://www.statisticsblog.com/2012/10/the-unicorn-problem/comment-page-1/#comment-13396</link>
		<dc:creator>Matt Asher</dc:creator>
		<pubDate>Mon, 15 Oct 2012 14:14:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.statisticsblog.com/?p=695#comment-13396</guid>
		<description>@Jan Galkowski

Very interesting problem! Kind of like the German Tank Problem (http://www.statisticsblog.com/2010/05/how-many-tanks-gtp-gets-put-to-the-test/) for species. Would be interesting to test the theorem you linked with an MC simulation.</description>
		<content:encoded><![CDATA[<p>@Jan Galkowski</p>
<p>Very interesting problem! Kind of like the German Tank Problem (<a href="http://www.statisticsblog.com/2010/05/how-many-tanks-gtp-gets-put-to-the-test/" rel="nofollow">http://www.statisticsblog.com/2010/05/how-many-tanks-gtp-gets-put-to-the-test/</a>) for species. Would be interesting to test the theorem you linked with an MC simulation.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jcb</title>
		<link>http://www.statisticsblog.com/2012/10/the-unicorn-problem/comment-page-1/#comment-13395</link>
		<dc:creator>jcb</dc:creator>
		<pubDate>Mon, 15 Oct 2012 07:32:12 +0000</pubDate>
		<guid isPermaLink="false">http://www.statisticsblog.com/?p=695#comment-13395</guid>
		<description>DX hunting. That&#039;s a radio amateur activity and the uniform discoverability rate you assume above does not hold at all.</description>
		<content:encoded><![CDATA[<p>DX hunting. That&#8217;s a radio amateur activity and the uniform discoverability rate you assume above does not hold at all.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jan Galkowski</title>
		<link>http://www.statisticsblog.com/2012/10/the-unicorn-problem/comment-page-1/#comment-13393</link>
		<dc:creator>Jan Galkowski</dc:creator>
		<pubDate>Mon, 15 Oct 2012 00:43:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.statisticsblog.com/?p=695#comment-13393</guid>
		<description>How about an even more interesting problem: Given the rate of finding new species, and &quot;juicing&quot; the calculation by using estimates of weights on how efficacious certain sampling places and schemes are (&quot;importance weights&quot;), determine how many more species remain to be found. 

References for students of the matter, as I am:

http://oregonstate.edu/instruct/st571/urquhart/var_prob/sld011.htm

Overton, Stehman, &quot;The Horvitz-Thompson Theorem as a unifying perspective for probability sampling: With examples from natural resources sampling,&quot;  THE AMERICAN STATISTICIAN, 49(3), August 1995, pp 261ff.

A. R. Solow, W. K. Smith, &quot;Estimating species number under an inconvenient abundance model,&quot;  JOURNAL OF AGRICULTURAL, BIOLOGICAL, AND ENVIRONMENTAL STATISTICS, 14: 242-252, 2009.</description>
		<content:encoded><![CDATA[<p>How about an even more interesting problem: Given the rate of finding new species, and &#8220;juicing&#8221; the calculation by using estimates of weights on how efficacious certain sampling places and schemes are (&#8220;importance weights&#8221;), determine how many more species remain to be found. </p>
<p>References for students of the matter, as I am:</p>
<p><a href="http://oregonstate.edu/instruct/st571/urquhart/var_prob/sld011.htm" rel="nofollow">http://oregonstate.edu/instruct/st571/urquhart/var_prob/sld011.htm</a></p>
<p>Overton, Stehman, &#8220;The Horvitz-Thompson Theorem as a unifying perspective for probability sampling: With examples from natural resources sampling,&#8221;  THE AMERICAN STATISTICIAN, 49(3), August 1995, pp 261ff.</p>
<p>A. R. Solow, W. K. Smith, &#8220;Estimating species number under an inconvenient abundance model,&#8221;  JOURNAL OF AGRICULTURAL, BIOLOGICAL, AND ENVIRONMENTAL STATISTICS, 14: 242-252, 2009.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: M.P.</title>
		<link>http://www.statisticsblog.com/2012/10/the-unicorn-problem/comment-page-1/#comment-13391</link>
		<dc:creator>M.P.</dc:creator>
		<pubDate>Sun, 14 Oct 2012 19:25:15 +0000</pubDate>
		<guid isPermaLink="false">http://www.statisticsblog.com/?p=695#comment-13391</guid>
		<description>The biggest problem with the assumptions is that all species are equally likely to be encountered each time. If you made some rarer than others, I bet it would take even longer to find your unicorn.</description>
		<content:encoded><![CDATA[<p>The biggest problem with the assumptions is that all species are equally likely to be encountered each time. If you made some rarer than others, I bet it would take even longer to find your unicorn.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
