Dreams as a way to avoid brain overfitting?

I was thrilled to read a short article on technologynetworks.com entitled “The Weirdness of Our Dreams Could Explain Their Function”. Below, I cite selected parts of this popular article, which is based on the study listed in the references. A final comment is also added.

The main idea is the following: our brains are mimicked by neural networks. Neural networks, if trained repeatedly on a limited set of trial data, struggle when presented with test data that differs from the trial. Think of an autonomous car trained only on US roads being subsequently asked to navigate the densely packed streets of New Delhi. To avoid overfitting, AI researchers introduce corruptions into the training data and occasional blanks in the data, as in the technique called dropout.
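To make the idea of “occasional blanks” concrete, here is a minimal sketch of dropout applied to an array of activations. It assumes the common “inverted dropout” variant (surviving values are rescaled so that the expected value is unchanged); the function name `dropout` and the parameter `drop_prob` are chosen for this illustration only and are not taken from the article or from Hoel’s paper.

```python
import numpy as np

def dropout(activations, drop_prob, rng):
    """Randomly blank out entries and rescale the survivors (inverted dropout)."""
    keep_prob = 1.0 - drop_prob
    # Boolean mask: True where the value is kept, False where it is blanked.
    mask = rng.random(activations.shape) < keep_prob
    # Dividing by keep_prob keeps the expected value of each entry unchanged.
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
x = np.ones((1000, 100))
noisy = dropout(x, drop_prob=0.5, rng=rng)
print(noisy.mean())  # close to 1.0 on average, although about half the entries are 0
```

Each training pass sees a differently corrupted version of the same data, which is the sense in which the network is discouraged from memorising any single input exactly.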

Can we look at the process of dreaming as a similar, natural way to make our brains less overfitted and more capable of learning? The overfitted brain, says Hoel’s paper, might still be able to learn and memorize things, but struggles to generalize that information. He gives the example of someone learning a new video game. The player might grasp the basics quickly but then plateau, only to find their performance starts to improve again after a night’s sleep. Could this be due to more than a good rest – could the brain be benefiting from weird dreams that give the brain a new perspective on the rote mechanics of a novel task?

One reason for me to be excited about this article is that we usually make hypotheses about how the performance of a neural network can be improved by looking at the way the brain functions. Here, the reasoning is backwards – we make a hypothesis about the way the brain functions based on a technique used in neural nets.

A second reason to find this inspiring is that if the described assumption is true, then we are naturally created to learn and understand general concepts!

References:

Hoel E. The overfitted brain: Dreams evolved to assist generalization. Patterns. 2021;2(5). doi:10.1016/j.patter.2021.100244


The development of the probability concept and some great ideas about chance

As probability is a main tool in making all kinds of decisions, it is important to trace back how the idea and the concept of probability developed.

This post consists mainly of selected quotes from the book “Ten great ideas about chance”, written by two Stanford professors, Diaconis and Skyrms, with some additional sentences and explanations, some of them taken from the other cited sources. I strongly recommend the book [1], since it is written in a very intriguing way by amazing authors!

J. Bernoulli

Prior to Jacob Bernoulli and his “Ars Conjectandi” (written 1689), where he proved the weak law of large numbers (WLLN), the top mathematicians had not identified probability with frequency. For them (led by Laplace and De Morgan), probability was a form of rational degree of belief. What was frequency then, and what was the connection between frequency and probability? The WLLN establishes such a relationship.

What Bernoulli proved was his golden theorem, from which the WLLN follows. Suppose we are given the chance of a random event E happening in an experiment, an interval I around that chance for the relative frequency of E, and a target probability. Bernoulli derived an upper bound on the number of trials n required so that, when the experiment is repeated n times, the observed frequency of E lies in I with at least that probability.
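To get a feeling for what such a bound looks like, here is a small sketch. It does not reproduce Bernoulli’s own (more involved) estimate; it swaps in the much later Chebyshev inequality, which gives $P(|\text{frequency} - p| \ge \varepsilon) \le p(1-p)/(n\varepsilon^{2})$, and solves for the smallest n that pushes the right-hand side below a tolerance $\delta$. The function name and the numerical values are assumptions made for this illustration.

```python
from fractions import Fraction
import math

def chebyshev_trials(p, eps, delta):
    """Smallest n with p(1-p) / (n * eps^2) <= delta (Chebyshev bound, not Bernoulli's)."""
    p, eps, delta = Fraction(p), Fraction(eps), Fraction(delta)
    return math.ceil(p * (1 - p) / (eps ** 2 * delta))

# Hypothetical setting: chance 3/5, interval [29/50, 31/50], failure probability 1/1001.
n = chebyshev_trials(Fraction(3, 5), Fraction(1, 50), Fraction(1, 1001))
print(n)
```

Note the direction of the statement: the chance p is an input, and the conclusion is about the frequency.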

This is an inference problem from chances to frequencies, but not the inverse problem – from frequencies to chances. Yet, Bernoulli somehow convinced himself that he had solved the inverse inference problem! How did he do so? He basically argued that with a large enough number of trials, the observed frequency would be (approximately) equal to the chance. And since the probability that these two quantities are not equal is very small, we can treat them as the same thing. But if frequency equals chance, then chance equals frequency, so the inverse problem is also solved. This is Bernoulli’s swindle and it is a big fallacy, because if one tries to formalize it, one sees that the conditional probabilities go in different directions. The inverse problem was actually solved by Thomas Bayes. The mantra that we should identify relative frequencies and probabilities was repeated even in the 20th century, by distinguished probability theorists like Borel, Markov and Kolmogorov.

The story with hypothesis testing is very similar… There, for example, to test whether a drug is effective, one would naturally want to know the probability that the drug is effective, given the data. Instead, when confirming this hypothesis, we are saying that the probability of observing the given data, if the drug were not effective, is pretty small! Almost like in Bernoulli’s swindle…
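A tiny numerical sketch shows how far apart the two conditional probabilities can be. All numbers below (the prior and the two likelihoods) are made up for the illustration, and the function name is hypothetical; the only ingredient is Bayes’ rule.

```python
from fractions import Fraction

def posterior_effective(prior, p_data_if_effective, p_data_if_not):
    """P(effective | data) via Bayes' rule."""
    evidence = prior * p_data_if_effective + (1 - prior) * p_data_if_not
    return prior * p_data_if_effective / evidence

prior = Fraction(1, 10)                 # assumed share of candidate drugs that work
p_data_if_effective = Fraction(4, 5)    # assumed P(data | drug effective)
p_data_if_not = Fraction(1, 20)         # assumed P(data | drug not effective), "pretty small"

post = posterior_effective(prior, p_data_if_effective, p_data_if_not)
print("P(data | not effective) =", p_data_if_not)
print("P(not effective | data) =", 1 - post)
```

Under these assumptions the data would be unlikely if the drug did not work, yet the probability that the drug does not work, given the data, is far from small.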

John Venn

John Venn was the first to give a full-length exposition of the frequentist view, in “The Logic of Chance” (1866), but he also wrestled with the exact formulation. For him, probability is the limit of the relative frequency as the number of trials goes to infinity. This idea raised some new mathematical issues: for example, there are sequences whose limiting relative frequencies do not exist, but fluctuate and never approach a limit… Another issue is that limiting frequencies may not add properly (for example, when you add an infinite number of limiting frequencies)… Venn’s theory appears to be full of holes, but it is to his credit that he saw most of them himself.

Richard Von Mises

Then came Richard von Mises, who set out to put the theory of probability on a sound mathematical basis. The challenge was, of course, raised by David Hilbert (his problem number 6) at the famous congress in Paris in 1900, where special emphasis was given to the role of probability in statistical physics. Von Mises also interpreted probability as relative frequency, but only in a specific type of infinite sequences that have additional properties. The first extra property was the existence of the limiting frequency (not very surprising)! But this was not enough… We want more – whenever the term probability is used, it should relate to a (limit of a) frequency. Thus, we also want to know which sequences (with frequency limits in them) can be associated with probabilities! And the mere existence of a limiting frequency does not characterise the ‘good’ sequences. Here is one sequence that has a limiting frequency, but which R. von Mises did not consider ‘good’: imagine a coin outcome sequence where head always follows tail and tail always follows head. The limiting frequency of heads is 50%. Can we associate a probability of 1/2 with just one such sequence, though?

One problem is that we can reorder the tosses in our sequence, so that it converges to any value in [0, 1] that we like. (If this is not obvious, consider how the relative frequency of even numbers among positive integers, which intuitively ‘should’ converge to 1/2, can instead be made to converge to 1/4 by reordering the integers with the even numbers in every fourth place, as follows: 1, 3, 5, 2, 7, 9, 11, 4, 13, 15, 17, 6, …). First of all, why should one ordering be privileged over others? A way to avoid looking at sequences suffering from this problem is to impose the requirement of randomness of the considered sequences, i.e. the relative frequencies should be invariant under selection of subsequences in some specified manner. In other words, in our special type of ‘good’ infinite sequences, the relative frequency of an event should be the same for any infinite subsequence that one might select. Von Mises called these ‘good’ sequences – Kollektivs and they have both properties – existence of limiting relative frequency and randomness.
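The reordering argument is easy to check numerically. The sketch below generates the reordered sequence 1, 3, 5, 2, 7, 9, 11, 4, … and compares the relative frequency of even numbers with the one in the natural ordering; the helper names are chosen for this illustration only.

```python
from itertools import count, islice

def reordered_integers():
    """Yield 1, 3, 5, 2, 7, 9, 11, 4, ...: three odd numbers, then one even number."""
    odds = count(1, 2)
    evens = count(2, 2)
    while True:
        for _ in range(3):
            yield next(odds)
        yield next(evens)

def even_frequency(sequence, n):
    """Relative frequency of even numbers among the first n terms of the sequence."""
    first_n = islice(sequence, n)
    return sum(1 for k in first_n if k % 2 == 0) / n

natural = even_frequency(count(1), 100_000)
reordered = even_frequency(reordered_integers(), 100_000)
print(natural, reordered)
```

Every positive integer still appears exactly once in the reordered sequence, so the same set of numbers yields a different limiting frequency purely because of the ordering.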

Another part of R. von Mises’s motivation for the randomness requirement is his understanding that any probability statement relates to an aggregate phenomenon, rather than to an individual isolated event (e.g. some sequence with a given fixed ordering, without considering the other orderings and, respectively, subsequences). Here is a quote from him:

“The probability of dying within the coming year may be $p_1$ for a certain individual if he is considered as a man of age 42, and it may be $p_2 \neq p_1$ if he is considered as a man between 40 and 45, and $p_3$ if he is considered as belonging to the class of all men and women in the United States. In each case, the probability value is attached to the appropriate group of people, rather than to the individual”

In fact, with the randomness requirement, von Mises tries to capture the independence of successive tosses directly, without invoking the product rule! At this point a natural question may arise: why do we need Kollektivs at all? Why isn’t it sufficient to use the distribution (as in effect happens in Kolmogorov’s theory we will mention later) instead of the unwieldy formalism of Kollektivs? The answer is that Kollektivs are a necessary consequence of the frequency interpretation, in the sense that if one interprets probability as limiting relative frequency, then infinite series of outcomes will exhibit Kollektiv-like properties. Therefore, if one wants to axiomatise the frequency interpretation, these properties have to be built in.

Yet, the next logical question is – what could be the allowed ways to select subsequences in a Kollektiv? Or equivalently – what is a truly random sequence?

For von Mises, the property we call randomness can be explicated – or even defined – by the impossibility of devising a successful system of gambling:

“A boy repeatedly tossing a dime supplied by the U. S. mint is quite sure that his chance of throwing “heads” is ½. But he knows more than that. He also knows that if, in tossing the coin under normal circumstances, he disregards the second, fourth, sixth, …, turns, his chance of “heads” among the outcomes of the remaining turns is still ½. He knows—or else he will soon learn—that in playing with his friends, he cannot improve his chance by selecting the turns in which he participates. His chance of “heads” remains unaltered even if he bets on “heads” only after “tails” has shown up three times in a row, etc. This particular feature of the sequence of experiments appearing here and in similar examples is called randomness. “
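The boy’s experience can be imitated with a short simulation. This is only a sketch: a pseudo-random generator stands in for the coin (it is not a Kollektiv in von Mises’s sense), and the function names are made up for the illustration. The two selection rules are the ones from the quote: skipping every second turn, and betting only after three tails in a row. For contrast, the strictly alternating sequence mentioned earlier is included as well.

```python
import random

def select_after_three_tails(tosses):
    """Keep only the tosses that come right after three tails (0s) in a row."""
    selected = []
    for i in range(3, len(tosses)):
        if tosses[i - 3:i] == [0, 0, 0]:
            selected.append(tosses[i])
    return selected

def frequency_of_heads(tosses):
    return sum(tosses) / len(tosses)

rng = random.Random(42)
tosses = [rng.randint(0, 1) for _ in range(200_000)]  # 1 = heads, 0 = tails

odd_turns = tosses[0::2]                      # disregard the 2nd, 4th, 6th, ... turns
after_tails = select_after_three_tails(tosses)

print(frequency_of_heads(tosses))
print(frequency_of_heads(odd_turns))
print(frequency_of_heads(after_tails))

# The alternating sequence H, T, H, T, ... has limiting frequency 1/2,
# but the same "skip every second turn" rule selects only heads.
alternating = [1, 0] * 1000
print(frequency_of_heads(alternating), frequency_of_heads(alternating[0::2]))
```

For the simulated coin all three frequencies stay near 1/2, whereas for the alternating sequence a simple place selection changes the frequency completely, which is why von Mises did not accept it as ‘good’.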

This is somewhat related to the understanding of probability as a subjective quantity, i.e. not something related to sequences, etc., but a degree of belief. Subjective probabilities are traditionally analyzed in terms of betting behavior. The reason is that if one wants to define a degree of belief, one good attempt is the following:

Your degree of belief in E is p iff p units of utility is the price at which you would buy or sell a bet that pays 1 unit of utility if E, 0 if not E.

But still, can we allow all possible subsequences when requiring the limiting relative frequency to be the same? Apparently not, because for any given binary sequence, we can always select the subsequence of those positions where we have 1s. The relative frequency of the 1s in this particular subsequence won’t be the same (unless we don’t have any 0s in the initial sequence). Thus, we cannot really construct a Kollektiv explicitly, because once we have it, one can create such a ‘bad’ subsequence. To this objection, von Mises answered that Kollektivs are new mathematical objects, not constructible from previously defined objects, i.e. they are not to be thought of as numbers, i.e. known objects.

Interestingly, Richard von Mises had a brother called Ludwig von Mises, who proposed his own theory of probability (see [2]), but it didn’t become that popular.

There are important differences between Richard and Ludwig von Mises’s respective views about randomness, or “indeterminism.” Richard von Mises was heavily influenced by the work of Heisenberg, whose work was interpreted by Richard to have established the basic indeterminism of the world at both the macrophysical and microphysical levels. This view of the world as inherently indeterministic allows Richard to take the position that probabilities are objective “physical properties” of things in the world:

“The probability of a 6 is a physical property of a given die and is a property analogous to its mass, specific heat, or electrical resistance. Similarly, for a given pair of dice (including of course the total setup) the probability of a ‘double 6’ is a characteristic property, a physical constant belonging to the experiment as a whole and comparable with all its other physical properties”

Ludwig von Mises, on the other hand, does not follow his brother down this indeterministic road. In the first place, Ludwig was a determinist, who held that everything that occurs in the world has a prior cause.

Kolmogorov

Going back to the ideas of R. von Mises, he held that in order to understand Kollektivs, one should always bear in mind the analogy with the idealized objects in geometry. Indeed, just as the point, the line, the circle, etc. are idealised objects, so are the Kollektivs. For example, just as in practice you cannot have 2 points in the plane that are exactly at some given distance d, you cannot point out a sequence that is an exact Kollektiv!

But… if such idealisations are permissible, can we then use an even more idealised notion of probability that will make our life easier? Could we not just have objective chances – some idealised quantities assigned to any physical situation or experiment, that show the tendency (or ‘propensity’) of a certain outcome to be observed? This is what is now called the ‘propensity view’ of probabilities. As Sir Karl Popper stated (he came up with a propensity theory independently of Charles Peirce, who was first), the outcome of a physical experiment is produced by a certain set of “generating conditions”. When we repeat an experiment, as the saying goes, we really perform another experiment with a (more or less) similar set of generating conditions. Thus, we may look at chances as quantities related to physical experiments, that have objective existence in the world.

Such a view needed a framework, and here came the measure-theoretic framework of Kolmogorov. The main contribution of the framework is that it looks at random quantities (variables) as measurable functions, the theory of which was developed ~30 years prior to Kolmogorov, by Borel and Lebesgue. The book [1] cites some very interesting words of Mark Kac – a renowned probabilist, who said that back in 1933-34, he was wondering what exactly the random quantities $X_{1},X_{2},\ldots$ were, that he read about in a work by A. Markov. Kac knew everything about measure theory, but in the 1930s, people still hadn’t internalised the connection.

The mathematical object that Kolmogorov used to study probability is the one we are all familiar with from our undergrad classes. It is a triple:

$\langle X, \mathbb{F}, \mathbb{P}\rangle$,

related to an experiment, where $X$ is the set of possible outcomes, $\mathbb{F}$ is a set of subsets of $X$ – those things that have probabilities and  $\mathbb{P}$ is a non-negative real-valued function on $\mathbb{F}$.

This triple is called a probability space, and $\mathbb{F}$ and $\mathbb{P}$ have some additional properties – $\mathbb{F}$ is closed under complements and under countable unions and intersections, and $\mathbb{P}$ is countably additive, with $\mathbb{P}(X) = 1$. So, the random quantities (or variables) are no longer mysterious objects – they are just measurable functions!
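For a finite experiment these axioms can be checked by brute force. The sketch below uses a fair die as a toy example (my choice, not taken from [1]): $X = \{1,\ldots,6\}$, $\mathbb{F}$ is the set of all subsets of $X$, and $\mathbb{P}$ is the uniform measure. In the finite case countable closure and countable additivity reduce to their pairwise versions, which is what the code verifies. All names are hypothetical.

```python
from fractions import Fraction
from itertools import chain, combinations

X = frozenset(range(1, 7))  # outcomes of a die

def power_set(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return {frozenset(c) for c in subsets}

F = power_set(X)  # the events, i.e. the sets that have probabilities

def P(event):
    """Uniform probability measure on X."""
    return Fraction(len(event), len(X))

def is_closed(F, X):
    """Closed under complements, pairwise unions and pairwise intersections."""
    complements = all((X - a) in F for a in F)
    pairs = all((a | b) in F and (a & b) in F for a in F for b in F)
    return complements and pairs

def is_additive(F):
    """P(A or B) = P(A) + P(B) for all disjoint events A, B."""
    return all(P(a | b) == P(a) + P(b) for a in F for b in F if not (a & b))

# A random variable is just a function on X; "measurable" means that
# preimages of its values are events in F.
def is_even(outcome):
    return outcome % 2 == 0

preimage_of_true = frozenset(o for o in X if is_even(o))
print(is_closed(F, X), is_additive(F), P(X), preimage_of_true in F, P(preimage_of_true))
```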

To modern eyes, Kolmogorov’s axioms look very simple, and one may well wonder why it took such a long time for probability theory to mature. One reason appears to be that probability was considered to be a branch of mathematical physics (this is how Hilbert presented it), so it was not immediately apparent which part of the real world should be incorporated in the axioms.

Another main contribution of Kolmogorov here was that he properly formalized conditional probability.

The Geneva conference

In 1937, the University of Geneva organized a conference on the theory of probability, where the focal point of the discussion was von Mises’ axiomatisation of probability theory, and especially its relation to the newly published axiomatisation by Kolmogorov. An excellent reading here is [3], where the arguments of the 2 sides are explained in much more detail.

In summary, as a result of this conference, the framework of Kolmogorov was established as the standard framework when considering probabilities. A crucial role in the debate was played by an example of Jean Ville, who showed that the law of the iterated logarithm cannot be derived via the theory of von Mises.
However, there are still people who are proponents of the frequentist view and it seems that taking a position in this debate is also a matter of philosophical preferences!

References:

1. P.Diaconis, B. Skyrms, “Ten great ideas about chance”
2. M. Crovelli, A CHALLENGE TO LUDWIG VON MISES’S THEORY OF PROBABILITY, https://mises-media.s3.amazonaws.com/-2-23_2.pdf
3. M. van Lambalgen, “RANDOMNESS AND FOUNDATIONS OF PROBABILITY: VON MISES’ AXIOMATISATION OF RANDOM SEQUENCES”, https://pdfs.semanticscholar.org/853a/5cdd7c2e443f898dca230d31ac4556970d76.pdf
4. https://philosophy.stackexchange.com/questions/56035/what-is-frequentism
5. Hájek, Alan, “Interpretations of Probability”, The Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/probability-interpret/#CriAdeForIntPro

Important first notes on Machine Learning and the Bias-Variance Trade-Off

I am listing a few introductory comments about ML that are quite important, but that you do not hear very often.

0. Motivation and Popularity

0’. [ML popularity] The ML field has become so popular in recent years because it is a theory that helps you get new knowledge out of data. Many companies realised that they can use their huge amounts of data to earn more money and, along with that, to improve their products. Of course, this also became possible because of the capability to store huge amounts of data.

0’’. On the other hand, working as a data analyst/scientist/ML engineer became very popular among people with Math and Stat-related education, not only because the salaries are comparable to those in software engineering, but also because the work itself is supposed to be more interesting for many of the aforementioned people, since it gives them the opportunity to get new insights from the data and that is, at least in theory, really exciting!

Last but not least, working as a Data Scientist, you can directly help to improve a certain product, so you feel you have a significant impact… which is quite important these days. To have a big impact and to work for good causes should be, I think, the number 1 objective of an employee when choosing a job position and a company nowadays. However, many people still neglect that and prefer to work for a casino instead of a cancer research institution, just because the former offers a 5% bigger salary. Given that, in practice, both salaries are enough to cover all of one’s needs, such a preference is definitely sad… but this is another topic… let’s go back to ML.

1. Relations to other fields

1’. [ML, DS and AI] Together with the words “Machine Learning (ML)”, one probably hears a lot the similar terms “Data Science (DS)” and “Artificial Intelligence (AI)”.

One easy association in order to remember the main differences between these terms is the following:
Data science produces insights

Machine learning produces predictions

Artificial intelligence produces actions.

1’’. [ML and Statistical Learning] After all, in ML, the task of the algorithms used is to process examples randomly generated from an unknown distribution in order to draw conclusions about this distribution. But, wait, didn’t we have the same settings and goals in Statistics? This is indeed the case, but there are some differences:

• In Statistics, we usually have some given set of possible hypotheses, while in ML we sometimes hope that some algorithms could figure out meaningful patterns (or hypotheses).
• In Statistics, one is usually less interested in the computational efficiency of the applied techniques. In ML, the execution of the learning process by a computer is central, hence the algorithmic complexity also is.
• In Statistics, we are often interested in the asymptotic behaviour of some quantities (statistics) when the sample sizes grow to infinity. In contrast, the theory of ML focuses on finite sample bounds and the degree of accuracy that a learner can expect on the basis of such samples.

2. The Bias-Variance Trade-Off

The bias-variance (or bias-complexity) trade-off is one crucial thing to understand at the very beginning, when starting to study ML. To do that, one should first understand the term “overfitting”, which probably all of the readers of this post already know well, but let’s give one fun example that they could also enjoy.

My favourite example of overfitting is the “3964 formula” discovered before the World Cup soccer competition in 1998: Brazil won the championships in 1970 and 1994. Sum up these 2 numbers and you will get 3964; Germany won in 1974 and 1990, adding up again to 3964; the same thing with Argentina winning in 1978 and 1986 (1978+1986 = 3964). This is a very surprising fact, but everyone can see that it is not advisable to base any future prediction on that rule. And indeed, the rule gives that the winner of the World Cup in 1998 should have been England since 1966 + 1998 = 3964 and England won in 1966. This didn’t happen and the winner was France!
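The arithmetic of the rule is easy to replay. The sketch below stores only the winners mentioned above as “training data” and turns the 3964 formula into a predictor; the dictionary and the function name are, of course, just an illustration.

```python
# World Cup winners mentioned in the text (the "training data").
winners = {
    1966: "England",
    1970: "Brazil",
    1974: "Germany",
    1978: "Argentina",
    1986: "Argentina",
    1990: "Germany",
    1994: "Brazil",
}

def predict_winner(year):
    """The '3964 formula': the winner in `year` is the winner in 3964 - year."""
    return winners.get(3964 - year)

# The rule fits every pair of known years perfectly...
for year in (1986, 1990, 1994):
    print(year, predict_winner(year), winners[year])

# ...and confidently makes a wrong prediction for the first unseen year.
print(1998, predict_winner(1998), "(actual winner: France)")
```

Zero error on the data used to build the rule and a wrong answer on the first new case is exactly the pattern that the word overfitting describes.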

What was the problem with the model we used? We relied heavily on a relationship between a known and an unknown variable that is simply not there, i.e. not part of the true relationship. This is related to the name ‘overfitting’ – in some sense we have fitted our model to the available data too much.

Just looking at the last sentence, it is not crystal clear. First of all, why is this fitting too much? How did we know? What exactly is going on here? To figure it out, let’s introduce some simple notation. We will then derive and explain the bias-variance tradeoff.

Formalization:

Imagine the simple example of credit scoring. You are a bank and a customer comes to you with a request for credit. We can either say yes or no and, of course, we want to make the better decision. As a bank, we have some data for each potential customer in the country: age, salary, education, credit history, credit amounts in other banks, etc. This is a vector of predictors $X = (x_{1},x_{2}, \ldots , x_{p})$.

• Input: $X = (x_{1},x_{2}, \ldots , x_{p})$ [the customer’s data]

So we can look at a particular customer as a datapoint in the set of all customers $\mathcal{X}$. We want to come up with a decision y = 0 or 1 – we do not or we do give him a credit.

• Output: $y$ [good/bad customer]

How can we make a good decision? The bank has tons of data from some previous customers, i.e. a number of input vectors X, for the customers that they have given credit before and the respective outcome – whether they have defaulted or not. The model that we are looking for could use this data substantially. In fact, a model here is actually a function from $\mathcal{X}$  to $\{0,1\}$. Let the true such function be denoted with f.

• Unknown target function: $f: \mathcal{X} \to \{0,1\}$

In addition, we have the previous data

• Data: $\mathcal{D} = ((X_{1},y_{1}),(X_{2},y_{2}), \ldots , (X_{N},y_{N}))$ [historical records]

Based on this data, we should propose some function g that is as close to f as possible. The way to propose such a function is by some algorithm, called a learning algorithm. It is actually a way, given the data, to select a function g out of some set of functions (hypotheses) – $\mathcal{H}$. We are the ones who determine the set $\mathcal{H}$, e.g. we can take it to be the set of all neural networks or the set of all polynomials of total degree up to 5, etc… One reason to restrict ourselves here, instead of considering all possible formulas, is that this way we generally decrease the chance of overfitting. We will soon see why this is true.

To understand the bias-variance tradeoff, we will slowly follow the derivation of the main equation related to it. The notations that we will use are borrowed from the great lectures of prof. Abu-Mostafa(see [5]):

In Learning, what we are interested in is minimizing

$E_{out}(g^{(\mathcal{D})}) = \mathbb{E}_{x}[(g^{(\mathcal{D})}(x)-f(x))^{2}],$

i.e. the out-of-sample error, averaged over the test points x, of the function g that our learning algorithm returns when given some particular training data $\mathcal{D}$ as input. In fact, when evaluating a learning algorithm, we should look at the performance for any possible training data that could be sampled, so we are actually interested in the expected value of this error over all possible training sets:

$\mathbb{E}_{\mathcal{D}}[E_{out}(g^{(\mathcal{D})})] = \mathbb{E}_{\mathcal{D}}[\mathbb{E}_{x}[(g^{(\mathcal{D})}(x)-f(x))^{2}]] = \mathbb{E}_{x}[\mathbb{E}_{D}[(g^{(\mathcal{D})}(x)-f(x))^{2}]]$

In the second equality, we just changed the order of taking expectations. We are allowed to do that because of Fubini’s theorem (see this StackExchange question). We did this because we can say something about the inner part.

What is it? Indeed, how can we evaluate $\mathbb{E}_{D}[(g^{(\mathcal{D})}(x)-f(x))^{2}]$? If we expand the square and take the individual expectations, we will have a sum of three expectations, none of which can be simplified further… But the second term will involve $\mathbb{E}_{D}[g^{(\mathcal{D})}(x)]$. We know that $\mathbb{E}_{D}[g^{(\mathcal{D})}(x) - \mathbb{E}_{D}[g^{(\mathcal{D})}(x)]] = 0$, so we can add and subtract $\overline{g}(x) = \mathbb{E}_{D}[g^{(\mathcal{D})}(x)]$ to make one of the three terms 0. Let us call this quantity the average hypothesis $\overline{g}$. Does it have any interpretation? Well, if we imagine that the possible training data sets that one can give us are $\mathcal{D}_{1},\mathcal{D}_{2},\ldots , \mathcal{D}_{K}$, all equally likely, then

$\overline{g}(x) = \mathbb{E}_{D}[g^{(\mathcal{D})}(x)] = \frac{1}{K}\sum\limits_{i=1}^{K} g^{(\mathcal{D}_{i})}(x)$

Of course, often, the number of possible training data sets is infinite (for example, when each datapoint is sampled from a continuous distribution). However, the latter equation gives us some intuition about the ‘average’ hypothesis. For a given x, it is simply the average value that our learning algorithm will return at the point x, over all possible sets of training data that we might have. Then, the average hypothesis is a function returning this average for any x.

Back to our calculation, we will have

$\mathbb{E}_{D}[(g^{(\mathcal{D})}(x)-f(x))^{2}] =$

$\mathbb{E}_{D}[(g^{(\mathcal{D})}(x)-\overline{g}(x) + \overline{g}(x) - f(x))^{2}] =$

$= \mathbb{E}_{D}[(g^{(\mathcal{D})}(x)-\overline{g}(x))^{2} + (\overline{g}(x) - f(x))^{2} + 2(g^{(\mathcal{D})}(x)-\overline{g}(x))(\overline{g}(x) - f(x))] =$

$= \mathbb{E}_{D}[(g^{(\mathcal{D})}(x)-\overline{g}(x))^{2} + (\overline{g}(x) - f(x))^{2}]$

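since $\overline{g}(x) - f(x)$ does not depend on $\mathcal{D}$ and, as we just said, $\mathbb{E}_{D}[g^{(\mathcal{D})}(x)-\overline{g}(x)] = \mathbb{E}_{D}[g^{(\mathcal{D})}(x) - \mathbb{E}_{D}[g^{(\mathcal{D})}(x)]] = 0$, so the expectation of the cross term vanishes.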

We obtained

$\mathbb{E}_{D}[(g^{(\mathcal{D})}(x)-\overline{g}(x))^{2} + (\overline{g}(x) - f(x))^{2}]=$

$= \mathbb{E}_{D}[(g^{(\mathcal{D})}(x)-\overline{g}(x))^{2}] + \mathbb{E}_{D}[(\overline{g}(x) - f(x))^{2}]$

But the second term does not depend on $\mathcal{D}$, so it is a constant with respect to the expectation over $\mathcal{D}$. Then, we get

$\mathbb{E}_{D}[(g^{(\mathcal{D})}(x)-f(x))^{2}] =$

$= \mathbb{E}_{D}[(g^{(\mathcal{D})}(x)-\overline{g}(x))^{2}] + (\overline{g}(x) - f(x))^{2}=$

$= variance(x) + bias(x)$

Therefore,

$\mathbb{E}_{\mathcal{D}}[E_{out}(g^{(\mathcal{D})})] = \mathbb{E}_{x}[\mathbb{E}_{D}[(g^{(\mathcal{D})}(x)-f(x))^{2}]]=$

$=\mathbb{E}_{x}[variance(x) + bias(x)] =$

$= variance + bias$

We call the latter two expectations variance and bias. Once again:

$variance = \mathbb{E}_{x}[\mathbb{E}_{\mathcal{D}}[(g^{(\mathcal{D})}(x)-\overline{g}(x))^{2}]]$

and

$bias = \mathbb{E}_{x}[(\overline{g}(x) - f(x))^{2}]$.

Note that the quantity $variance(x)$ is the actual variance of the prediction value that our learning algorithm will give if we look at it as a random variable over all possible training datasets.
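The decomposition can also be checked by simulation. The following is a rough Monte Carlo sketch, assuming a setting in the spirit of the sinusoid example from the lectures [5]: the target is $f(x) = \sin(\pi x)$, x is uniform on $[-1, 1]$, every training set has only 2 points, and two hypothesis sets are compared (constants and straight lines). The function names, the number of simulated training sets and the test grid are my own choices for the illustration, and `fit_line` only handles the two-point case.

```python
import numpy as np

def f(x):
    # Target function assumed for this illustration.
    return np.sin(np.pi * x)

def fit_constant(X, Y, x_test):
    """Least-squares constant hypothesis: the mean of the training labels."""
    b = Y.mean(axis=1, keepdims=True)
    return np.repeat(b, x_test.size, axis=1)

def fit_line(X, Y, x_test):
    """Straight line through the two training points of each data set."""
    slope = (Y[:, 1] - Y[:, 0]) / (X[:, 1] - X[:, 0])
    intercept = Y[:, 0] - slope * X[:, 0]
    return slope[:, None] * x_test[None, :] + intercept[:, None]

def bias_variance(fit, n_datasets=5000, n_test=201, seed=0):
    rng = np.random.default_rng(seed)
    x_test = np.linspace(-1.0, 1.0, n_test)            # grid standing in for E_x
    X = rng.uniform(-1.0, 1.0, size=(n_datasets, 2))   # one row = one training set D
    Y = f(X)
    G = fit(X, Y, x_test)                               # G[i, j] = g^(D_i)(x_j)
    g_bar = G.mean(axis=0)                              # average hypothesis on the grid
    variance = ((G - g_bar) ** 2).mean(axis=0).mean()   # E_x[E_D[(g - g_bar)^2]]
    bias = ((g_bar - f(x_test)) ** 2).mean()            # E_x[(g_bar - f)^2]
    e_out = ((G - f(x_test)) ** 2).mean()               # E_D[E_out(g^(D))]
    return bias, variance, e_out

for name, fit in [("constant", fit_constant), ("line", fit_line)]:
    bias, variance, e_out = bias_variance(fit)
    print(f"{name}: bias={bias:.2f} variance={variance:.2f} "
          f"bias+variance={bias + variance:.2f} E_out={e_out:.2f}")
```

With so few training points, the richer hypothesis set (lines) has the smaller bias but a much larger variance, and in both cases the simulated expected error agrees with the sum of the two terms.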

Here is the place to clarify the term overfitting. Well, if for each given x, our prediction is quite different for different training datasets (high $variance(x)$), then usually slight changes in one or a few training examples change the algorithm’s choice of a hypothesis a lot. This probably means that we are using some rules that rely substantially on every single datapoint, i.e. we bet that some quite sophisticated patterns are present in our data.

Similarly, in practice, some wrong decisions are usually made because we rely too much on our current experience, which is limited and not relevant to the big picture and the true relationships that determine what will actually happen. Exactly as in the example with the “3964 formula”!

However, we should note that this example is not the best one to explain the bias-variance tradeoff, just because the rule was made up after looking at the data. As you can see, the bias-variance tradeoff is important when evaluating a learning algorithm that is fixed, regardless of the training data.

So we decomposed the MSE into 2 terms that measure:

1. How well can $\mathcal{H}$ approximate the true, but unknown relationship, $f$?
2. How easy is it to find a good hypothesis $h$ in the set $\mathcal{H}$, just knowing some given number of training datapoints?

In the picture above, one can think of a fixed test point x. We want the sum of “the distance” between $f$ and $\overline{g}$ (the bias) and the radius of the circle (the variance, i.e. the complexity of our learning procedure) to be small. Since we are looking at the expected out-of-sample mean squared loss (MSE), this variance in our prediction will result in a higher MSE, in expectation. Note that if we use the mean absolute error or other loss functions, then the situation is quite different.

Remark 1: We should balance between the two. If $\mathcal{H}$ is too big, as a set, and includes pretty much all the possible models, then the true $f$ generally tends to be very close to the average hypothesis. The bias term will be small. However, just by the (often limited) training data, we won’t be able to point out one such good $h$, close to $f$. Conversely, if $\mathcal{H}$ is too small, then the variance in our prediction, for each x, will probably not be that big, but “the distance” between the true target function $f$ and the average hypothesis will be quite big.

Remark 2: When we have a small number of training data points, it is advisable to use simpler models so that the variance term is not too big. On the other hand, if the selected model is too simple, e.g. a straight line, then the bias term will be big, unless the true relationship is indeed linear or close to linear.

Remark 3: It is also important to know that, in practice, it is not possible to explicitly compute the test MSE, bias or variance for a given learning method, because the target function $f$ is unknown. For a specific example where the bias and variance terms are calculated explicitly, for learning a sinusoidal function, see 30:55 in the video [5].

References:

1. Pedro Domingos, “A Few Useful Things to Know about Machine Learning”
2. S. Shalev-Shwartz, S. Ben-David, “Understanding Machine Learning”
3. T. Hastie, R. Tibshirani, J. Friedman, “Elements of Statistical Learning”
4. Learning 01 – The Learning problem
5. Learning 08 – The Bias-Variance Tradeoff

Best from Knuth’s impressive book on the Bible [Part 3: 1Kings – Ezra]

-1Kings

The editor of the book emphasizes the moral of the story, not the bare facts. For example, he dismisses King Omri with only a few lines and says, “Omri sinned against God more than any of his predecessors” (16:25), although by worldly standards Omri was actually one of the greatest kings of this era (indeed, Israel was known to the Assyrians as the land of Omri for many years).

The Jewish historian Josephus retold the story of the two prostitutes who lived in the same house and argued over which of them was the mother of a living child and which the mother of a dead one. Solomon determined who the true mother was by ordering the living child to be bisected; the real mother would not allow it. According to Josephus, Solomon ordered both babies to be bisected, so that each woman would receive half of each child. This makes the judgement curiously like Exodus 21:35. Similar stories have been found in other parts of the world. The version closest to this Biblical account is one of the tales of the Buddha’s former incarnations in India.

Solomon was endowed with different types of wisdom, including the ability to compose poems and proverbs and an unusual knowledge of plants and animals (4:32,33). He was not, however, said to have excelled at mathematics, which was already developed in Mesopotamia. Perhaps that is why he spent too much money 🙂

-2Kings

The original Hebrew Bible had only two books in their place, called Samuel and Kings, but the first Greek and Latin translations of the Old Testament(including Eastern Orthodox Bibles) chopped them into two parts each and called them ‘1Kingdoms’ through ‘4Kingdoms’.

The continuity of David’s line is stressed repeatedly; for example, in most cases the southern kings are said to be buried in the city of David and the queen mother’s name is given, but such information almost never appears for the northern kingdom (David came from the southern kingdom). Clearly, God’s promise is not annulled (8:19, 20:6).

-1Chronicles

This is the first of 4 Biblical books written by a man whom scholars call ‘the Chronicler’. The sons of Levi, responsible for Israel’s religious activities, are explicitly mentioned more than 100 times here, but only 4 times in the books of Samuel and Kings. Thus the Chronicler was probably a Levite; perhaps, in fact, he was a cantor, because music is especially prominent here.

Verses 3:10-14 list the kings of Judah (the southern kingdom) from Solomon to Josiah, in each case listing only the one son who succeeded to his father’s throne. We can see that the Davidic dynasty was very impressive indeed: the crown passed from father to son in an unbroken string extending through 18 generations, covering almost 400 years.

-2Chronicles

The books of Chronicles have often been unjustly neglected, perhaps because they originally appeared at the very end of the Hebrew Bible. Even after they were moved by the Greek scholars, they received the name Paraleipomenon, ‘leftovers’, implying that they just fill some gaps in stories already told (in Samuel and Kings).

Bible scholars once doubted that these books were written before the events they describe. However, after some recent archeological discoveries, the balance of opinion changed.

-Ezra

Ezra, whose name means ‘help’, led a large group of Jews back to Palestine after they had been in captivity near Babylon. Several leading specialists in Old Testament history believe that Ezra himself was the Chronicler.

Best from Knuth’s impressive book on the Bible [Part 2: Joshua – 2Samuel]

See Part 1 for intro.

Joshua:

Probably the most debatable verse here is 10:13 which says that the sun happened to stand still for a while, as a result of Joshua’s prayer(read more on the question here).

Pacifists are not fond of this book. In contrast with God’s commandment not to kill (Exodus 20:13), the Israelites were directed to massacre every living thing in Jericho (6:21). According to Psalm 106:34, God wanted them to destroy even more than they did! How can we reconcile this with the instruction to turn the other cheek? The pagan fertility cults that were rampant at the time of Joshua were so diametrically opposed to God’s will that it was best to obliterate them and to teach the world a better way.

The historic entry of the Israelites into Canaan was associated with a miraculous parting of the waters, just as in the exodus from Egypt (4:23)! In later years, Elijah and Elisha are said to have crossed the Jordan River in a similar way (2 Kings 2:8).

Two of the Dead Sea scrolls contain the earliest manuscripts of the book of Joshua(100 B.C.); but unfortunately the leather has been eaten away by worms, and only a few scraps are still readable.

– Judges:

The book is fascinating, because it consists largely of ancient folk tales that have been preserved really well. These colorful stories have lively details and touches of humor. We learn, for example, that Ephraimites could not pronounce the ‘sh’ in ‘shibboleth'(12:6).

Left-handedness was evidently prized by the tribe of Benjamin, to which Ehud belonged, even though Benjamin literally means ‘son of the right hand’.

– Ruth:

Ruth ultimately becomes known as King David’s great-grandmother(4:17), thus an ancestor of Jesus Christ(Matthew 1:5). Jewish people read this book every year at their feast of Shavuot, 50 days after Passover. Goethe praised the beauty of this book and it is generally regarded as a magnificently crafted work of literature.

Many scholars once concluded that the book was written in the time of Ezra and Nehemiah, after the exiles returned to Israel from Babylon, primarily as a reaction against the prohibition of mixed marriages (how could Ruth be David’s ancestor then?). Because of more recent linguistic arguments, scholars now tend to believe that Ruth was composed several centuries before Ezra’s time.

Verse 3:16 contains a question from Naomi to Ruth whose literal meaning in Hebrew is “Who are you?”. A fragment of the Dead Sea Scrolls, found in cave number 2, surprised everybody, because it has Naomi asking “What are you?”. Archaeologists have unearthed documents using similar constructions. For example, an Ugaritic tablet says: “Baal is dead; who is he?”. This question apparently means “What has become of him?”. Similarly, Naomi’s “Who are you?” probably means: “In what condition are you now? A widow or a wife? Are you Mrs. Boaz?”

– 1Samuel:

This is the 1st half of what was originally a single, undivided book. It begins with the story of Israel’s last two judges, Eli(4:18) and Samuel(7:15); it concludes with the story of Israel’s first two kings, Saul(11:15) and David(16:1).

The Hebrew text of Samuel has the dubious distinction of being the least well-preserved of all the Old Testament sources. For some reason, the ancient scribes who copied this material made more errors than usual and the old manuscripts show considerable variation in small details. Some of the Dead Sea Scrolls have helped to resolve many of the textual riddles.

– 2Samuel:

The middle portion of the book is a literary masterpiece that contains a remarkably vivid portrait of David’s complex personality, evidently composed by an eyewitness. When confronted with his guilt, David freely confessed his wrongs and accepted the punishments. Nowhere else in the world before that time would a priest be free to accuse the king(David) of wrongdoing!

Best from Knuth’s impressive book on the Bible. [Part 1 – the Torah]

I decided to share a few excerpts from the illuminating book by Donald Knuth, who turned out to be a Lutheran. These facts and sentences are interesting in their own right. The post will be most enjoyable for people who are at least partially familiar with the Bible texts, yet it may also spark the interest of atheists and of readers completely unfamiliar with them.

Genesis:

3:16 here is “…you will be filled with desire for your husband, and he will dominate you”. In other words, the woman will yearn for her husband. An almost identical construction occurs in the Hebrew text of Gen. 4:7, where we read that “sin yearns for you and you can dominate it”. In both cases the object of yearning becomes dominant. This is not a decree for the subjugation of women! Interestingly, in the same way, men are dominated by women (Song of Solomon 7:10, 1Esdras 4:22).

Exodus:

3:16 here is the first place where the Bible mentions “leaders of Israel”. We know from later references (e.g. Exodus 24:1, Numbers 11:16) that this group included at least 70 people. They are sometimes called “elders”; the Hebrew word literally means ‘old people’, and they were evidently venerable members of the community.

Here, God has just interpreted his name, Jehovah, to Moses in verse 14. This verse has been the subject of much scholarly speculation. The King James Version of the Bible renders it as ‘I am that I am’, but the best translation is probably ‘I am; that is who I am’ (New English Bible) or ‘I am the one who is’ (New Jerusalem Bible).

Leviticus:

The book is filled with rules and regulations and thus many readers find it boring. The Levites (the tribe of Levi) were responsible for religious ceremonies (Numbers 3:6); therefore it is not surprising that the book has been called Leviticus. However, if you examine the first 7 books of the Bible, Leviticus contains the fewest explicit references to Levi and his sons! The reason for this curious fact is that priests were selected from the descendants of Aaron, while the other Levites performed subsidiary duties.

3:16 here states: “The priest shall burn … , as a food of pleasing aroma. All of the fat belongs to God”. In Genesis, Noah made an offering after the flood waters subsided and God was moved by the pleasing smell (Gen. 8:21). We can find a similar statement in the Epic of Gilgamesh (11:160-161), which says that the gods smelled the offering. To the people of ancient times, ‘fat’ was associated with richness. The Hebrew word that means ‘fat’ in verse 16 here means ‘best’ in Numbers 18:12 and 2Sam. 1:22!

Numbers:

The title of the book is misleading, because only 9 of its 36 chapters have a preponderance of numerical data. ‘Numbers’ is the first book in English Bibles to have a purely English title.  St Jerome’s ancient Latin translation set the standards by translating the Greek titles to those that we’re familiar with now.

Chapter 1 (and similarly ch. 26) reports individual statistics of men of military age for each of the non-Levite tribes. Almost every Bible scholar agrees that these numbers are impossibly large. If they were true, the population density of Israel would be more than 3 times that of Singapore or Hong Kong. An archeological discovery of clay tablets north of Canaan provides a possible explanation. The first number, 46500, for instance, may well mean “46 units of army, a total of 500 men”. Another interesting fact is that the first digit of each of these numbers is between 2 and 7 (and never 0, 1, 8 or 9). The odds against this happening by chance (if we assume a uniform distribution of the digits) are more than 200 000 to 1.
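The quoted odds can be reproduced with a few lines of arithmetic. The sketch below is my own reconstruction, not a calculation taken from the book. It assumes that the digit in question is the hundreds digit (the first digit of the “men” part under the “units + men” reading), that it could a priori be any of 0 to 9 with equal probability, and that the 24 figures of both censuses are pooled. The census figures are the ones reported in Numbers 1 and Numbers 26; the helper name `hundreds_digit` is made up for the illustration.

```python
# Men of military age per non-Levite tribe, as reported in Numbers 1 and 26.
CENSUS_CH1 = {
    "Reuben": 46500, "Simeon": 59300, "Gad": 45650, "Judah": 74600,
    "Issachar": 54400, "Zebulun": 57400, "Ephraim": 40500, "Manasseh": 32200,
    "Benjamin": 35400, "Dan": 62700, "Asher": 41500, "Naphtali": 53400,
}
CENSUS_CH26 = {
    "Reuben": 43730, "Simeon": 22200, "Gad": 40500, "Judah": 76500,
    "Issachar": 64300, "Zebulun": 60500, "Manasseh": 52700, "Ephraim": 32500,
    "Benjamin": 45600, "Dan": 64400, "Asher": 53400, "Naphtali": 45400,
}


def hundreds_digit(n):
    """Hundreds digit of n, e.g. 46500 -> 5 (assumed to be the digit meant)."""
    return (n // 100) % 10


counts = list(CENSUS_CH1.values()) + list(CENSUS_CH26.values())
digits = [hundreds_digit(n) for n in counts]

# Observation: every one of the 24 digits lies between 2 and 7.
all_between_2_and_7 = all(2 <= d <= 7 for d in digits)

# Assumption: each digit is independent and uniform on 0..9, so a single
# digit lands in {2,...,7} with probability 6/10.
probability = (6 / 10) ** len(digits)
odds_against = 1 / probability

if __name__ == "__main__":
    print("all digits in 2..7:", all_between_2_and_7)
    print(f"probability by chance: {probability:.2e}")
    print(f"odds against: about {odds_against:,.0f} to 1")
```

With 24 pooled figures the result is a bit above 200 000 to 1, which matches the number quoted above; with only the 12 figures of chapter 1 the odds would be far smaller, so the pooling assumption matters.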

Deuteronomy:

This book completes the Pentateuch(the Torah, which is called the law of Moses(Ezra 3:2, Ezra 7:6)). Martin Luther’s translation of the Bible calls the first books 1Moses, 2Moses, and so on; Deuteronomy is 5Moses.

The famous verse Matthew 22:37, which Jesus calls “the greatest commandment”, is actually a quote from Deuteronomy (6:5). He also quoted 8:3, 6:16 and 6:13 when he was tempted in the wilderness (Matthew 4:4,7,10).

Jacob wrestled with God near the Jabbok river(Gen.32:22-28). There is an interesting play on words relating Jacob, Jabbok and the Hebrew for ‘wrestle’, which is ‘abaq’.

The Bible doesn’t give a clear idea of how the tribes of Reuben and Gad divided the land given to them. According to Numbers 32:34-38, Reubenites occupied a cluster of cities in the middle of the region, but Joshua 13:15-28 reports a quite different distribution, with Reuben in the south. The Moabite Stone, a famous relic from 840 B.C. now in the Louvre, agrees with the Numbers account!

What to look for when trying to come up with a nice new mathematical question?

I was inspired to list a few things one may try when attempting to take a well-known object in math and ask a new and unusual, but interesting, question about that object. This inspiration came from the article in the Notices of the American Mathematical Society dedicated to Herbert Wilf (who passed away in 2012). There, one of the people close to him, Carla Savage from NCSU, gave a few examples of questions proposed by Wilf related to integer partitions, that is, representations of a number $n$ as a sum, i.e.

$n = x_{1} + x_{2} + \ldots + x_{k}$,  $x_{1} \geq x_{2} \geq \ldots \geq x_{k}$ .

Why is this important? Well, most of the greatest people in combinatorics and mathematics, like Paul Erdős, Noga Alon, Herbert Wilf and many others, had (and some still have) the great ability to formulate new and interesting problems, often connected to something that many people are familiar with. This ability distinguished them and made them successful and famous. In addition, it seems to me that formulating a new problem and making a few steps towards its solution is the easiest way to get interesting results and to publish (and presumably to get lots of citations related to the newly opened lines of inquiry). It is far easier, in the majority of cases, than trying to solve old and well-known conjectures. It is also far more interesting and enjoyable for you (again, in most cases). Here is the list of general ideas:

• change something in the setting directly!

Values of the parameters involved, the operations involved, anything that might be reasonable!

• define edges between the nodes(the considered objects) and ask something about the obtained graph.
Example:  111111, 21111, 3111, 2211, 222, 321, 33, 42, 411 . Take a certain partition of n. Increase one of the parts by 1 and decrease another part by 1. You get a new partition of n. Can you pass through all partitions of n continuing that way further?
• Consider a bunch of objects together, to create something nice, simpler or just familiar.

Example: Is it true that for large enough n, an $n \times p(n)$ rectangle can be tiled by the Ferrers diagrams corresponding to the $p(n)$ partitions of n?

• Consider part of the object and a simple nice property of this part

Example: Given $m\geq 0$, for how many partitions $n = x_{1} + x_{2} + \ldots + x_{k}$,  $x_{1} \geq x_{2} \geq \ldots \geq x_{k}$, of $n$ do we have $m = x_{2} + x_{4} + x_{6} + \ldots$?

• Consider a random object from the family and some probability. What if $n\to \infty$ ?

Example: What is the probability that a randomly chosen part size in a random partition of n has multiplicity m, when $n\to \infty$?
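The graph idea from the list above is easy to play with on a computer. Here is a minimal brute-force sketch, only meant for small $n$. I assume that a part may be created from nothing or may disappear, which is what the listed steps 42 → 411 and 21111 → 3111 require. The function names `partitions`, `neighbours` and `hamiltonian_path` are my own, and the depth-first search is just the simplest thing that works, not the method from the paper.

```python
def partitions(n, max_part=None):
    """Yield all partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def neighbours(p):
    """Partitions obtained from p by adding 1 to one part and removing 1 from another.

    Assumption: a new part of size 1 may appear (a 'zero part' grows),
    and a part of size 1 may vanish.
    """
    parts = list(p) + [0]  # the extra 0 lets a brand-new part be created
    result = set()
    for i in range(len(parts)):
        for j in range(len(parts)):
            if i != j and parts[j] > 0:
                q = parts[:]
                q[i] += 1
                q[j] -= 1
                q = tuple(sorted((x for x in q if x > 0), reverse=True))
                if q != p:
                    result.add(q)
    return result


def hamiltonian_path(n):
    """Search for a walk through all partitions of n (n >= 1), starting at 1+1+...+1."""
    nodes = list(partitions(n))
    adjacency = {p: neighbours(p) for p in nodes}
    start = (1,) * n
    path, seen = [start], {start}

    def extend():
        if len(path) == len(nodes):
            return True
        for q in sorted(adjacency[path[-1]]):
            if q not in seen:
                seen.add(q)
                path.append(q)
                if extend():
                    return True
                seen.remove(q)
                path.pop()
        return False

    return path if extend() else None


if __name__ == "__main__":
    for p in hamiltonian_path(6):
        print("".join(str(x) for x in p))
```

For $n=6$ the search visits all 11 partitions; one valid walk is the sequence from the example continued by 51 and 6. Backtracking of this kind becomes slow as $n$ grows, so it is only a way to experiment with the question, not to settle it.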

The answers to the questions can be found in the paper.
Herbert S. Wilf (1931–2012). Available from: https://www.researchgate.net/publication/276868136_Herbert_S_Wilf_1931-2012 [accessed May 6, 2017].

The 3:16 project on the Bible by the renowned computer scientist Donald Knuth + awesome calligraphy!

I was pleasantly surprised when I recently heard that one of the most famous computer scientists, Stanford professor Donald Knuth, is a Christian and, more concretely, a Lutheran. It is very rare for a renowned scientist, and a mathematician in particular, to speak openly about his faith. But what is so valuable in this combination and why was I happy to hear that? Well, some very smart people who have reached a high level in a certain area can sometimes be very useful to others by expressing their opinions on topics in which they are not considered specialists. One such example is Garry Kasparov, the former world chess champion, who has had a big influence in recent years by sharing his thoughts and criticism on the politics of his own country, Russia, and its president, Putin. Certainly, questions of politics are very important, and the unique perspective of such a provably smart guy can indeed be very precious.

In the case of Donald Knuth, we have a similar situation. The questions of religion and faith concern every single person on Earth, and Knuth is provably very smart. When he finished his bachelor’s degree, he was directly awarded a master’s degree, too. He got his PhD at Caltech, which even today is one of the hardest universities to get into. He is well known for the invention of the TeX system (which is used by almost all scientists today to write mathematical formulas), for several algorithms bearing his name and for coining the term “Analysis of Algorithms”. Knuth also published two books on Christianity about 20 years ago (~2000). The main one is called “3:16 Bible Texts Illuminated” and the other “Things a Computer Scientist Rarely Talks About”. In the 2nd book, he mainly talks about the obstacles, enjoyment and knowledge he gained while working on the 1st book.

The idea of the 3:16 project was to analyze in detail just one verse from each book of the Bible, using all the known commentaries of scholars, etc. This looks silly on one hand, but on the other the approach could give us a good understanding of the whole Bible, similarly to the election polls done when a country chooses a president or parliament. And indeed, this makes sense, because the Bible text is huge and very few people are able to dedicate enough time to understand everything well, though it is very important to know at least some basics. Instead of flipping coins for each of the books, Knuth simply decided to always investigate the 16th verse of the 3rd chapter in all of them. This has the same effect as random sampling, but it is easier to memorize, because of the golden verse of the Bible, the most famous verse of all, namely John 3:16. Some of the books are too small to have 3 chapters, or their 3rd chapter is too short. Nevertheless, Knuth considered all the 59 books that have such a verse. For each of them, he produced exactly 4 pages: 1 page dedicated to the whole book, 1 page with the corresponding 3:16 verse itself, calligraphically written by a different world-famous calligrapher, and 2 more pages with an analysis of the verse and the verses related to it.

I bought both books and I will share below a few excerpts from them that I find interesting. If I have to pick a single adjective characterizing the books, it would be INSPIRING! You will see why in a few seconds…

Indeed, this post will focus on the illustrations that I liked most and they are truly inspiring:

• Knuth decided to make his own translation of each of these 59 verses, instead of using the versions in a particular Bible translation, because the question of which translation is the most correct has always been very controversial. Furthermore, this additional effort allowed him to go more in depth. Below is the golden verse, written by probably the most famous calligrapher, Hermann Zapf. Note how the word GOSPEL is formed downwards. Honestly, I would not say this is one of the most beautiful works among the illustrations, but the acrostic effect here deserves to be shown.

• The next picture depicts 2 Timothy 3:16, which says “All Scripture inspired by God is beneficial for principles for persuasion, for correction and for education about what is right”. I had not heard this verse much before, but it is really important and strongly supports the Protestant view that everyone should study the scriptures on their own! And could I say it more accurately: for persuasion, for correction and for education. Paraphrasing it: self-correction (repentance, too) and the truth do matter, and Bible reading can help! When I saw the verse for the first time I said: oh, finally a verse that justifies arguing about Biblical texts 🙂 This illustration is one of my personal favorites as well. I can’t imagine how hard it is to do this solely by hand!

• The next picture, on Galatians, was created by a Slovakian artist. There is an original element here: an additional message between the rows that clarifies something of extreme importance for the verse, namely that the last word in it, offspring, is in the singular, and not by chance. The particular promised descendant of Abraham that Paul was talking about was Christ himself! The promise was made around 2000 years before its fulfilment (most scholars consider that Abraham lived around 2150 BC).

• Another impressive and original element was incorporated in the graphics below. The author was a Jewish calligrapher named Ismar David, who was nearly 80 years old when he was invited to do Leviticus 3:16. The choice of artist makes sense, bearing in mind how closely Leviticus is related to Israel. The verse speaks about a burnt offering, so a related drawing, created with quite an interesting technique, was added to the text, which is written in both English and Hebrew.

• Finally, I decided to include here a simple but stylish illustration, which was Knuth’s favorite and was created by a person who had been an art director of National Geographic. The verse is 1Samuel 3:16. You have to know the story to realize how great this is, because Samuel is sort of saying “Here I am” very timidly. This aspect has been captured perfectly by the weights of the strokes!

I am pretty sure that many of those calligraphers were very interesting as people and characters. In fact, all people doing art are, to some extent. It seems that Knuth had quite an interesting experience when trying to communicate with a few of those guys. The last thing I selected to show here is the letter that Knuth got from a Russian artist who had the task of doing 1Timothy 3:16. This guy invented a funny and unique way to type English using a Cyrillic typewriter. I guess that those who are familiar with Cyrillic will appreciate it.

In summary, while reading the part concerning the illustrations in book 2, I saw some glimpses of creativity which were food for my own. One of the main messages of Knuth in book 2, “Things a Computer Scientist Rarely Talks About”, is that aesthetic sense is crucial when we are doing any work whatsoever. He says there that instead of just getting a job done, he prefers to do his scientific work in a way that pleases him in as many senses as possible. And indeed, while reading these books I saw how meticulous he is regarding many details in the implementation of his work, and that this is important in many cases, especially when your goal is to demonstrate something important and when it is supposed to be read by lots of people. I have never had this quality, so I am going to use the scripture and Knuth’s writings for correction, as the verse in 2 Timothy 3:16 from the beginning of the post suggests :]

Part 2 of the post will contain some excerpts of the same 2 books of Knuth that impressed me and that focus mainly on the Bible text.

What happens after we die? Soul Sleep or not?

Islam, Buddhism, and Hinduism all share the belief in the survival of the soul after the death of the body ([7]). The Catholic Church teaches that all those who die in God’s grace and friendship undergo purification after death, in a place called Purgatory. Catholics also believe that friends and family members can shorten the stay of their loved ones in Purgatory by paying for Masses, by prayers, by buying indulgences, etc. The general beliefs of the Eastern Orthodox Churches are very close. Most conservative Protestants believe that there are only two possible destinations for the soul after death: the glory of Heaven or the flames of hell.

In contrast, Adventists and others (e.g. Jehovah’s Witnesses) believe that death is the cessation of existence of the whole person, body and soul. They also think that the righteous dead are going to be resurrected after the Second Coming of Jesus (i.e. Soul Sleep).

I have listed here the best arguments I have heard on both sides – supporting the Soul Sleep and against the Soul Sleep doctrine:

Arguments supporting Soul Sleep:

1. [The judgement in the last days must have purpose]

2Peter 2:9 “… the Lord knows how to rescue the godly from trials and to hold the unrighteous for punishment on the day of judgment …”

But this judgement is in the last days:

2Tim. 4:1 “… Jesus Christ, who will judge the living and the dead, and in view of his appearing and his kingdom …”

If everybody goes immediately either to hell or to heaven, then what is the purpose of the judgement?

2. [Even Jesus described death as sleep]

• Our friend Lazarus sleepeth (John 11:11)
• We shall not all sleep, but we shall all be changed ( 1Cor. 15:51)
• But now is Christ risen from the dead, and become the firstfruits of them that slept. …but every man in his own order: Christ the firstfruits; afterwards they that are Christ’s at his coming (1Cor. 15:20,23)
• 1Thess. 4:13-17 and many other verses in the Psalms, the books of Kings, etc. (more than 50 times in total)

3′. [ Luke 16 is NOT literal]

This is the most cited passage [see it here] in the whole Bible when the question of the state of the dead is concerned. Many infer from it that all who die are now either in hell or in heaven. However, as a parable, this text should not be interpreted literally. Below are some arguments in support of the non-literal interpretation:

• How many people could lie at Abraham’s side? This symbol had been in use since Greek thinking spread among the Jews a few centuries before Jesus, so the audience was able to understand the phrase.
• Are you going to be happy in heaven if you constantly see your relatives suffering in hell? (By the way, the punishments in hell won’t be eternal, see [1].)
• How could Lazarus dip the tip of his finger in water and cool the rich man’s tongue? Are we in physical bodies in hell/heaven?
• The rich man saw Abraham far away with Lazarus at his side. The abyss is huge, but they can speak to each other. Isn’t this weird?

3”. [Gregory of Nyssa about Luke 16]

As mentioned, many people in the Protestant, Catholic and even Eastern Orthodox churches interpret Luke 16 as a very strong argument that our souls go to hell/heaven immediately. However, Gregory of Nyssa, who is one of the greatest and most widely respected scholars of the Eastern Orthodox church, disagrees. Even though he defends the literal view of the text, he says (in [2]):

• It is clear that all this will happen when we are resurrected. Just the soul cannot be judged, but only together with the body. Indeed, very often when the soul is in peace, the eye will look at something in the wrong way and will transfer this desire to the soul(through the brain).
• As Ezekiel says in chapter 37 (verses 1-12), the dry bones were transformed into flesh. This will happen in the last days, too.
• John 5:28,29: “Do not be amazed at this, for a time is coming when all who are in their graves will hear his voice and come out – those who have done what is good will rise to live, and those who have done what is evil will rise to be condemned.”

Nevertheless, we can hardly observe any consistency of opinion on doctrines among Eastern Orthodox scholars, so the fact that one such scholar has a somewhat different opinion should not carry that much weight!

4. [Other  supporting verses]

Eccl. 9:5-6 “For the living know they shall die: but the dead know not any thing.”

Psalm 146:4 “His spirit departs, he returns to the earth; in that very day his thoughts perish.”

Psalm 115:17 “The dead do not praise the Lord, nor any who go down into silence”

Psalm 6:5 “For in death there is no remembrance of Thee; in the grave who shall give Thee thanks?”

Job 14:10-12 “But man dieth, and wasteth away: yea, man giveth up the ghost, and where is he? As the waters fail from the sea, and the flood decayeth and drieth up: So man lieth down, and riseth not till the heavens be no more, they shall not awake, nor be raised out of their sleep.”

5. [Thief on the cross]

A very important verse here, and the object of many discussions, is Luke 23:39-43, where the thief on the cross next to Jesus said, “Lord, remember me when You come into Your kingdom”, and Jesus said to him (according to most translations), “Assuredly, I say to you, today you will be with Me in paradise”. The same story can be found in Matthew 27:38 and Mark 15:27.

So it seems that they both will be in heaven the very same day! How can Soul Sleep be true, then?

The well-known argument here(of Adventists and others) is that the oldest existing manuscripts of the New Testament do not contain punctuation marks, so the punctuation might be different. What if the comma is after the word ‘today’, so that Jesus actually says “Assuredly, I say to you today, you will be with Me in paradise”..?

At first, this alternative phrasing looks weird! Yet, if the comma is placed after the word ‘today’, it shows Jesus being emphatic about that day of his crucifixion, saying: today, when I am dying on the cross with no apparent hope, I am promising that you will eventually be with me in paradise.

This is also consistent with the words of Jesus, when he meets Mary in the garden on the first day of the week and says, “Touch me not; for I am not yet ascended to my Father: but go to my brethren, and say unto them, I ascend unto my Father, and your Father; and to my God, and your God” (John 20:17). If the comma is inserted before the word today, Jesus would then be promising that the thief would be with Him that very day in paradise; thus making Jesus a liar.

Further very important supporting arguments are the following other verses in Luke:

NIV Luke 5:26 Everyone was amazed and gave praise to God. They were filled with awe and said, “We have seen remarkable things today.”

NIV Luke 12:28 If that is how God clothes the grass of the field, which is here today, and tomorrow is thrown into the fire, how much more will he clothe you, O you of little faith!

NIV Luke 13:32 He replied, “Go tell that fox, ‘I will drive out demons and heal people today and tomorrow, and on the third day I will reach my goal.’”

NIV Luke 13:33 In any case, I must keep going today and tomorrow and the next day – for surely no prophet can die outside Jerusalem!

NIV Luke 19:5 When Jesus reached the spot, he looked up and said to him, “Zacchaeus, come down immediately. I must stay at your house today.”

NIV Luke 22:34 Jesus answered, “I tell you, Peter, before the rooster crows today, you will deny three times that you know me.”

NIV Luke 22:61 The Lord turned and looked straight at Peter. Then Peter remembered the word the Lord had spoken to him: “Before the rooster crows today, you will disown me three times.”

In all of them, the word today is used at the end of a clause or a sentence!

Arguments against Soul Sleep:

1. [Thief on the cross]

First, there are several other places in Luke’s gospel where Jesus uses “Ἀμὴν λέγω ὑμῖν” (translated as “Truly I say to you”), but he didn’t add ‘today’ at the end of any of them! Those places are: 4:24, 9:27, 12:37, 12:44, 18:17, 18:29, 21:3, and 21:32.

It is true that the oldest existing manuscripts of the New Testament do not contain punctuation marks, and the alternative punctuation is theoretically possible.

However, since they were both dying, there was no other time Jesus could have made this statement than that very day. Furthermore, his statement is a response to a specific request, “Lord, remember me when You come into Your kingdom.” In response, Jesus told the man that he would enter God’s kingdom immediately.

Moreover, there is, in fact, one place in Luke (verse 19:9) where the word ‘today’ (σήμερον) comes at the beginning of the sentence or clause:

NIV Luke 19:9 Jesus said to him, “Today salvation has come to this house, because this man, too, is a son of Abraham.”

2. [Jesus preached To Spirits]

Scripture says that Jesus preached to the spirits in prison.

For Christ also suffered once for sins, the just for the unjust, that He might bring us to God, being put to death in the flesh but made alive by the Spirit, by whom also He went and preached to the spirits in prison (1 Peter 3:18,19).

There is no purpose in guarding unconscious spirits in prison or preaching to them. The fact that they were under restraint shows them to be conscious.

3. [Moses And Elijah]

We also have the account of Moses and Elijah appearing at the transfiguration of Jesus (Matthew 17:1-8). The impression is that they are coming from a sphere of conscious life. There is no indication that they are awakening from some dreamless sleep or a similar state. However, they both come in bodies, which might be seen as an argument that a soul cannot exist without a body, i.e. an argument supporting the Soul Sleep view.

4. [Not Permitted To Talk]

Paul clearly said that he was not permitted to relate his experience of the time he was in the presence of God.

[Paul] was caught up into Paradise and heard things that are not to be told, that no mortal is permitted to repeat (2 Corinthians 12:4).

So we cannot know that Lazarus, for example, saw nothing while he was dead. All we know is that no such experience was recorded in the scripture. Maybe Lazarus was also not permitted to talk.

5. [with the Lord, away from the body]

The Bible record shows that in the New Testament, all of God’s people who are in the state of death are away from the body and present with the Lord:

NIV 2Cor. 5:8
“We are confident, I say, and willing rather to be absent from the body, and to be present with the Lord.”

NIV Philippians 1:23-24
“..I desire to depart and be with Christ, which is better by far;
24 but it is more necessary for you that I remain in the body”

If Paul was asleep at death, how could he say that to die, and to be with the Lord, was far better than being in this body?

NIV 2 Corinthians 12:2-3
“I know a man in Christ who fourteen years ago was caught up to the third heaven. Whether it was in the body or out of the body I do not know—God knows. And I know that this man—whether in the body or apart from the body I do not know, but God knows.”

Conclusion:

The question is quite difficult, but my personal opinion is that the arguments supporting Soul Sleep carry slightly more weight.

In particular, if we weigh supporting argument 5) against opposing argument 1), the former is stronger, since putting today at the end of a clause is a phrasal pattern that Jesus uses very often! The fact that he never uses today together with the words “Truly I say to you” is not that strong an objection, for me.

In addition, the consistency of Soul Sleep with John 20:17 (where Jesus says he has not yet ascended) and the numerous verses in the Old Testament that describe the dead as asleep are, taken together, strong arguments. The view against Soul Sleep is less consistent with other Biblical passages, even though verses in the epistles of Paul seem to corroborate it. Yet we also have New Testament verses (again from Paul) like:

NIV 1 Thessalonians 5:10
“Who died for us so that whether we are awake or asleep we might live with him.”

I think that this verse shows that one can be asleep, but live with Christ at some point, which supports the Soul Sleep doctrine!

References:

2. Gregory of Nyssa, Selected works (in Bulgarian), 2011
3. Valentin Velchev, Short historical and Biblical analysis of Adventism (in Bulgarian), 2003
4. Dechko Svilenov, Life after death (in Bulgarian), 2012
5. http://www.whatchristianswanttoknow.com/what-is-the-doctrine-of-soul-sleep-is-it-biblical/
6. Agop Tahmisian, Main Teachings of the Bible (in Bulgarian)
7. Samuele Bacchiocchi, Life after death
8. https://www.blueletterbible.org/faq/don_stewart/don_stewart_127.cfm
9. https://christianity.stackexchange.com/questions/45084/what-are-the-biblical-arguments-against-soul-sleep
10. https://hermeneutics.stackexchange.com/questions/897/comma-verily-i-say-unto-thee-today-or-verily-i-say-unto-thee-today

Antiochus or the Pope?

This is a post about the unique Seventh-Day Adventist understanding of Daniel chapter 8 (as well as ch.11 and the whole book), the little horn mentioned in chapters 7 and 8, and the 2300 evenings and mornings from chapter 8.

Understanding this post requires prior familiarity with this unique Adventist understanding, with the other widely known interpretations, and with the book itself. In short, the main point of the debate is the question of what “the little horn” in chapter 7 signifies (and the one in ch.8, where the ancient text uses a slightly different phrase). Roughly, those chapters describe the little horn as a person who will be against God’s people and who will have huge importance for the future of the world and its end!

Most Bible scholars who study the Old Testament and the book of Daniel today stand either behind the Preterist interpretation, which says that the little horn is a symbol of the Seleucid king Antiochus IV Epiphanes (who ruled from 175 BC until his death in 164 BC), or behind the Futurist interpretation, which says that the little horn is an evil person in the future. However, Seventh-Day Adventists still support the Historicist interpretation, which was also used by some of the pioneers of Protestantism (Luther, Calvin, etc.) and even by Isaac Newton. It says that the little horn is simply the papacy.

Below are given some of the best arguments for and against the latter interpretation:

Strongest arguments of the critic:

1. [Dan.7&Dan.8 differences]

There are some important differences between the little horn from Dan.7 and Dan.8, namely:

• The world powers in Dan.7 are represented by unclean beasts, while the world powers in Dan.8 are represented by sacrificial animals.
• Dan.7 is written in Aramaic, while Dan.8 and the subsequent chapters are written in Hebrew. This could indicate that the intended audience is the Jews.
• The Aramaic word for “little horn” in 7:8 is strictly translated “another horn, a little one”, while the Hebrew word for “little horn” in 8:9 is strictly translated “a horn from littleness”.
• The “little horn” of Daniel 7 did not have its beginning until the 4th beast was divided into 10 kingdoms, which happened in 476 AD. The “little horn” of Daniel 8 was to come up “in the latter time of their kingdom” (v. 23). “Their kingdom” refers to the four divisions of the Alexandrian Empire. The “latter time” or last days of the four kingdoms was 200 BC – 100 BC. Therefore, the little horn of Daniel 8 was to arise six centuries before the little horn of Daniel 7 existed!

2. [Rome&the sanctuary]

Rome did not have any contact with the Jewish nation until 161 BC. How could the little horn have begun its desecrating work in 457 BC, 296 years before it even came into contact with the Jewish state? Rome had no part whatsoever in the activities of 457 BC.

Rome lived peacefully with the Jewish nation and did not even molest the Jews until after Palestine became a part of the Roman Empire in 63 BC. How could the little horn be “trampling underfoot” the Sanctuary for nearly 400 years when it never even interfered with the sanctuary service during that time period?
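The intervals quoted in this argument are plain differences between BC years. As a quick check of the arithmetic, here is a minimal Python sketch (the helper name `years_between_bc` is my own and purely illustrative):

```python
def years_between_bc(earlier_bc: int, later_bc: int) -> int:
    """Years elapsed between two BC dates (a larger BC number is earlier)."""
    return earlier_bc - later_bc


# 457 BC (the Adventist starting date) to 161 BC (first Roman contact)
print(years_between_bc(457, 161))  # 296

# 457 BC to 63 BC (Palestine becomes part of the Roman Empire)
print(years_between_bc(457, 63))   # 394, i.e. "nearly 400 years"
```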

3. [sanctuary confusion]

How could the little horn be desecrating the Sanctuary in 457 BC when the prophecy does not even show it arising until after 301 BC? (Alexander’s kingdom was divided in 301 BC.)

4. [“melek”]

The Hebrew word for “king” in verse 23 is melek, and means “a king; king, royal” (Strong’s). The word, melek, is never translated “kingdom, or world power, or empire.” Thus, the “little horn” of Daniel 8 is a king, not an Empire.

5. [little horn’s attacks directions]

According to Dan. 8:9, the horn first attacks toward the south, then toward the east, and en route to the east, attacks the pleasant land:

8:9 And out of one of them came forth a little horn, which waxed exceeding great, toward the south, and toward the east, and toward the pleasant land.

Rome’s greatest conquests were to the North and West of Rome. Indeed, Rome conquered large regions of northwestern Europe, while Antiochus attacked only to the South and East of Syria (south towards Egypt and east towards Armenia and Persia), where his kingdom was centered.

The term pleasant land is found three times in the Bible, and in each case it refers to the promised land of Israel (see Psalms 106:23-26, Jeremiah 3:18-19, Zechariah 7:7,14). Antiochus assaulted the land of Israel, killing tens of thousands of Jews, in an attempt to stamp out the Jewish religion. (see also supporting argument 4)

6. [horn out of wind or not]

The idea of a horn growing out of the wind not only seems odd, it also violates the symbol’s visual unity. Note the visionary sequence:

• The goat appears with a great horn between its eyes
• The goat’s horn is broken off
• In its place grow four horns
• Out of one of these four horns comes another horn
• All horns are still linked to the body of the goat (Greece)

Nowhere in the book of Daniel (or Revelation) do we find a horn growing in the wind detached from a body! Horns do not grow out of the wind! Horns represent kings or divisions of a kingdom. The beast represents the kingdom itself. A horn detached from a body would represent a king with no kingdom!

7. [Prince of the host]

According to the commentators, the phrase “the prince of the host” in 8:11 refers to God himself, to the high priest Onias, or to Judas Maccabeus, and does not refer to Jesus, as the Adventists claim. In Philippians 2:8-9, Paul said about Jesus that “He humbled Himself and became obedient to the point of death, even the death of the cross. Therefore God also has highly exalted Him and given Him the name which is above every name”. However, this happened long after the times of Daniel.

During Antiochus’ rule, the high priest, Onias, was driven into exile and later killed in the cruelest manner. Furthermore, Antiochus figuratively magnified himself to the ultimate prince of the host, God Himself. His surname Theo Antiochus declared him to be an effulgence in human form of the Divine, a god manifest in flesh (see Edwin Bevan, The House of Seleucus, vol. 2, p. 154).

8. [1 Maccabees]

The book of Maccabees describes how the daily sacrifice was taken away, and how the sanctuary was desolated:

“And in his arrogance he went into the sanctuary and took the gold altar and the lampstand for the light, and all its furniture…” (1 Maccabees 1:21)

Antiochus’ attack on the Jewish religion was the worst crisis to face the Jews between the Babylonian captivity in 606 BC and the destruction of Jerusalem in 70 AD. After two years the situation for the sanctuary worsened:

“And they shed innocent blood all around the sanctuary, and polluted the sanctuary itself. … Her sanctuary became a desolate wilderness…” (1 Maccabees 1:37,39)

Antiochus’ goal was to destroy the Jewish religion and have all the people of Palestine unite and worship his heathen religion on penalty of death. He commanded:

“Then the king wrote to his whole kingdom that they should all become one people, and everyone should give up his particular practices. … and put a stop to whole burnt offerings and sacrifices and drink offerings at the sanctuary…” (1 Maccabees 1:41,42,45)

9. [the daynights in Genesis]

The Hebrew word for “day” (yowm, or yamim for “days”) does not appear in Dan. 8:14. The words translated “days” (`ereb boqer) literally mean “evenings and mornings” (the same words as those used in Genesis, where the right interpretation is literal).

Since the context of the verse itself concerns the daily sacrifices in the temple, which took place every morning and evening, the only reasonable conclusion is that this verse is talking about the literal daily sacrifices in the temple. Certainly it would be reckless to apply the “year-day” principle to every prophecy where “days” are mentioned:

• Jonah prophesied Nineveh would be destroyed in 40 days (Jonah 3:4), which did not equate to 40 years.
• In Genesis 6:3 God prophesied there would be a period of 120 years before the flood, which did not equate to 43,200 years.
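The two counterexamples above can be reproduced with a few lines of Python. This is only a sketch: the function name `year_day` is my own, and the 360-day year is an assumption, chosen because it is the year length that yields the 43,200 figure.

```python
DAYS_PER_YEAR = 360  # assumption: a year of 12 months with 30 days each


def year_day(days: int) -> int:
    """Apply the year-day principle: every day stands for one year."""
    return days


# Jonah 3:4: 40 days would become 40 years
print(year_day(40))  # 40

# Genesis 6:3: 120 years, counted in days, would become 43,200 years
print(year_day(120 * DAYS_PER_YEAR))  # 43200
```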

10. [literal daynights]

Accepting the literal day interpretation, we can assume that God told the Jews precisely how long His sanctuary would be profaned: 2300 evening and morning sacrifices would be suspended while the sanctuary was profaned. And this makes perfect sense, given the context! According to the Jewish calendar, the 2300 days work out to six years, three months, and 18 days. This time period began on the fifteenth day of the month Cisleu, in the year 145 of the Seleucidae, in which Antiochus set up the Abomination of Desolation upon the altar of God:

“Now the five and twentieth day of the month they did sacrifice upon the idol altar, which was upon the altar of God.” (1 Maccabees 1:59)

This was the beginning of a period of intense suffering for those in Israel who chose to remain faithful to God. Judas Maccabeus was outraged over the injustice done to God’s sanctuary (1 Maccabees 2:7,8,12).

Maccabeus rose up and started a revolt against Antiochus. For over three years he struggled and fought against the armies of Antiochus. Finally, he was victorious over Nicanor, on the thirteenth day of the month Adar, Anno 151, and the power of Antiochus over Judea was broken.

The Jews commemorate the triumph of Judas with an annual feast on this day called the Feast of Dedication (Hanukkah). The Savior honored this feast by His presence (John 10:22). After his victory, when Judas entered Jerusalem, he found “the sanctuary desolate.” (1 Mac. 4:38) Judas immediately directed that the sanctuary be rebuilt and cleansed so that it could be used again for sacred services (1 Mac. 4:41-51).

In addition, although the book of Maccabees is not part of the canon, it was widely accepted by the early church as historically plausible!

11. [Josephus and Antiochus]

During the generation when Jesus walked the earth, the actions of Antiochus Epiphanes were still fresh in the minds of the people. They understood Antiochus Epiphanes to be the Abomination of Desolation. The first-century Jewish historian Josephus wrote of Antiochus:

“And this desolation came to pass according to the prophecy of Daniel, which was given 408 years before; for he declared that the Macedonians would dissolve that worship [for some] time.” (Antiquities of the Jews, p. 260)

12. [The Jews and the abom.of desolation]

Even Jesus referred to the abomination in the book of Daniel (Dan. 9:27) as a warning to His followers that a similar desolation was going to happen to the Jewish nation in the future:

“When ye therefore shall see the abomination of desolation, spoken of by Daniel the prophet, stand in the holy place, (whoso readeth, let him understand:) Then let them which be in Judaea flee into the mountains.” (Matt. 24:15)

This is a clear indication that the prophecy has significance mainly for the Jews, which makes the papacy interpretation less probable.

13. [Antiochus period]

In verse 23 we find that “in the latter time of their kingdom” the little horn power would arise. This refers to the latter times of the four divisions of the Greek Empire, just prior to their being conquered by Rome. The four divisions began at the battle of Ipsus in 301 BC. The kingdom of Macedonia fell in 168 BC, the kingdom of Cassander in 146 BC, the kingdom of Seleucidae (over which Antiochus ruled), fell in 65 BC, and the Ptolemy kingdom lasted until 30 BC. Since the four-fold kingdom ceased to exist when Macedonia fell in 168 BC, the prophecy calls for the appearance of the little horn shortly before this time. Antiochus reigned from 175 BC to 164 BC.

14. [on ch.8&9]

• There are around 13 years between Dan. 8:1 and Dan 9:1 (see [1]). It is hard to believe that the angel returned to explain the vision to Daniel after so many years.
• Actually, the vision in Dan. 8 was explained to Daniel. The exact Jewish translation of the last verse of ch.8 is [see http://www.chabad.org/library/bible_cdo/aid/16491 ]:
27 And I, Daniel, became broken and ill for days, but I rose and did the king’s work, and I was terrified about the vision, but no one realized it.
In other words, no one realized that he was terrified; the text does not say ‘no one understood the vision’, as some translations have it!
• There is a clear relationship between Jeremiah’s 70-year prophecy mentioned in Dan. 9:2 and the 70-weeks prophecy explained later in the chapter. There the angel Gabriel clarifies that the 70-year prophecy does not refer only to the literal period of captivity. Leviticus 26:28 says:

28 Then I will walk contrary unto you also in fury; and I, even I, will chastise you seven times for your sins.

This is probably the reason why the 70 weeks in the prophecy are given as 70 units of seven. So, apparently, Gabriel here explains Jeremiah’s words and not the previous prophecy, which was already understood.

15. [on ch.11]

Verse 16 of ch.11 states: “” The Seventh-Day Adventist Bible Commentary identifies Cleopatra, daughter of Ptolemy XI, as “the daughter of women” from Dan. 11:17, but this Cleopatra was not given in marriage by the king of the north to the king of the south, as the verse suggests.

Supporting arguments:

1. [the negligible Antiochus]

The power of Antiochus Epiphanes is negligible compared to the power of the little horn described in both Dan.7 and Dan.8. In chapter 7 (verse 24) the little horn is described as different from the others and more powerful than them. Similarly, in chapter 8, the verb “to become great” [gaw-dal’] is used three times, in obvious relation to one another:

– In Dan. 8:4, the Persian ram is described as “great”.
– In Dan. 8:8, the Greek goat “waxed very great”.
– The little horn “waxed exceeding great” (Dan. 8:9).

Thus, it is highly unlikely that such a great power can be compared to the undistinguished Antiochus, whose influence did not exceed the borders of Palestine!

2. [Relation, ch.7&ch.8]

Chapter 7 finishes with the words “As for me, Daniel, my thoughts greatly troubled me, and my countenance changed; but I kept the matter in my heart”. Chapter 8 starts with “..a vision appeared to me—to me, Daniel—after the one that appeared to me the first time”, probably with the aim of introducing a new vision related to the previous one.

3. [First verses, ch.7&ch.8]

Another indication of a relation between the two is the chronological placement of the second vision with respect to the first. From the very first words of chapters 7 and 8, we see that the events happen in the first and in the third year of the reign of king Belshazzar, respectively, which is clear evidence that a connection exists. The same method was used in the introductions of chapters 1 and 2 (the first and the third year of Nebuchadnezzar’s kingdom), as well as in the introductions of chapters 9 and 10 (the first and the third year of the reign of Darius).

4. [little horn, ch.7&ch.8]

There are some important similarities between the little horn from Dan.7 and Dan.8. In both chapters he:

• is clever and arrogant
• is an opponent of the law (see 7:25, 8:12). Indeed, in the expression “he cast truth down to the ground” (Dan. 8:12), the word truth can be translated as “law” (see Ps. 43:3, 119:7,43,123). Even some Jewish commentators (Ibn Ezra, Rashi) translate this verse as “he nullifies the law”.
• persecutes the holy people (7:25, 8:24)
• appears after the beasts – kingdoms

5. [out of a wind or a horn]

The interpretation that in verse 8:9 “the little horn” comes from one of the four winds, and not from one of the four horns, is more plausible, because:

• It makes more sense for a horn to come from the head of the animal than from another horn.
• When a new horn appears, this always implies the fall of previous horns/kingdoms (see Dan 7:7).
• There is a grammatical parallelism between the phrases “toward the four winds of heaven” and “out of one of them”: “winds [feminine] of heaven [masculine]” and “one [feminine] of them [masculine]”. Also, these two phrases rhyme in the original text, which testifies to their relationship.

6. [the goat and the ram meaning]

The fact that Daniel speaks about only two kingdoms in this chapter, represented by two clean sacrificial animals, is not accidental. The other two kingdoms are skipped. The goal of the prophet is to focus the reader’s attention precisely on the ram-goat relationship. The book of Leviticus, chapter 16 (see verse 5), speaks about the same relationship, and the purpose of that whole chapter is to describe the Jewish feast Yom Kippur (the Day of Atonement). Several words in chapter 8 refer to the feast of Yom Kippur: “the daily sacrifice” [הַתָּמִ֔יד], “the transgression” [בְּפָ֑שַׁע] and “the sanctuary” [מִקְדָּשֹֽׁו׃]. Even the word translated as “the prince” in verses 11 and 25 is actually a word used in Ezra 8:24, 1 Chronicles 15:12, 24:25 for the high priest! Even the crucial verb “cleansed” is translated in the Septuagint with a special term used for Kippur. Also, the prominent Jewish commentator Rashi interprets this text as a direct reference to the Day of Atonement. The equivalent of “the judgement day” in Dan. 7:22 is the Day of Atonement here, in chapter 8. Moreover, in ancient Israel, the Day of Atonement was an image (a prototype) of the final judgement.

7. [Parallels, ch.7&ch.8]

Both chapter 7 and chapter 8 pass through the same stages:

Ch.7 – beasts/kingdoms – little horn – judgement
Ch.8 – beasts/kingdoms – little horn – cleansing of the sanctuary

Also, the little horn draws God’s judgement, which decrees his destruction, in both chapters (see 7:10-12, 8:25).

8. [ch.7 , Christ and Revelation]

Jesus interpreted the “abomination of desolation” from the book of Daniel as something still in the future (see Matt. 24:15). Jesus lived after Antiochus.

The period “3 and 1/2 times” [which refers to the little horn in Dan. 7:25] is mentioned in Revelation 12:6, 12:14 and 13:5. The first of these verses says: “the woman … she might be nourished for one thousand two hundred and sixty days” [1260 = 3 and 1/2 years]. The woman always signifies the church in the biblical text!
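The equation 1260 = 3 and 1/2 years holds only for a 360-day year, an assumption not spelled out in the text. A minimal Python sketch of the arithmetic (the constant and variable names are my own):

```python
DAYS_PER_MONTH = 30   # assumption: 30-day months
MONTHS_PER_YEAR = 12
DAYS_PER_YEAR = DAYS_PER_MONTH * MONTHS_PER_YEAR  # 360

# "3 and 1/2 times" read as three and a half years (7/2 years)
three_and_a_half_years = 7 * DAYS_PER_YEAR // 2
print(three_and_a_half_years)  # 1260

# the forty-two months of Revelation 13:5 give the same number of days
forty_two_months = 42 * DAYS_PER_MONTH
print(forty_two_months)  # 1260
```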

9. [‘Maccabees’]

It is clear that Christ and the New Testament did not regard the earlier Antiochus as fulfilling the work of the little horn, even though the apocryphal book of 1 Maccabees labels something Antiochus set up on the altar of the Jerusalem temple as the “abomination of desolation” (1:54).

The scholar Steven Weitzman wrote an article titled “Plotting Antiochus’s Persecution” [Journal of Biblical Literature, 2004]. Although Weitzman continues to accept the Preterist version that Antiochus IV is the little horn, he supports the idea that the books of Maccabees are propaganda that fits into a longstanding ancient Near Eastern literary tradition! Indeed, it looks plausible that the Maccabees saw the little horn in Antiochus and that they painted him in especially dark colors in an attempt to portray the Maccabees as saviours of the Jewish religion.

10. [‘broken without hand’]

Antiochus was not “broken without hand” (verse 25): there is no suggestion of anything miraculous or mysterious about either his failure with the Jews or his death.

11. [‘understanding dark sentences’]

Antiochus was “fierce” toward the Jews, but was not noted for “understanding dark sentences” (verse 8:23).

12. [Antiochus doesn’t fit]

The 1260 days of Daniel 7 certainly do not equate with the 2300 “half days,” or 1150 “full days,” of Daniel 8. Dr. Charles H. H. Wright, of Trinity College, Dublin and Oxford (Daniel and His Prophecies, 1906, p. 186), declared, on the 2300-day calculations of Daniel 8:

“All efforts, however, to harmonise the period, whether expounded as 2300 days or as 1150 days, with any precise historical epoch mentioned in the Books of the Maccabees or in Josephus have proved futile.”

In keeping with the symbolic nature of the prophecies of Daniel 7 and 8, we would expect the time periods in these chapters to be symbolic also.

Furthermore, additional evidence that the 2300 evenings and mornings refer to 2300 full days appears if we compare Deuteronomy 9:25 (‘the forty days and forty nights’) or Exodus 27:20-21. In addition, in Genesis we have “And there was evening, and there was morning, one day”.

13. [connection, ch.8&ch.9]

Verse 8:16 says “And I heard a man’s voice between the banks of Ulai, which called, and said, Gabriel, make this man to understand the vision.” However, at the end of the chapter, Daniel is still confused and does not understand it. The original manuscripts were not divided into chapters, which makes a little more plausible the Adventists’ interpretation that chapter 9 (more precisely, 9:24) gives us information about the beginning of the 2300 day-nights period and helps Daniel to finally understand the vision (after his long prayer in verses 9:3-20).

14. [strong connection, ch.8&ch.9]

In support of the relationship between chapter 8 and 9 and the visions of 2300 days and 70 weeks:

– The 2300 days of Dan. 8 were the only part not explained by Gabriel.
– In Dan. 9, the same angel Gabriel comes to give him “explanation and skills”. Daniel’s prayer in the chapter didn’t contain any request for an explanation.
– Gabriel then points him back to the ‘mareh’ (9:23), the same word as the one used in ch.8 for the 2300 days vision.
– Then, in the next verse (v.24), Gabriel gives him another time prophecy (70 weeks) which is “cut-off” [the word chatak is not used anywhere else in the Bible, though it appears numerous times in the Mishnah, a Jewish Bible commentary compiled in the 1st century A.D. There, the same word means “which is cut off” in 18 out of the 19 times it is used].

15. [ch.11 doesn’t fit to Antiochus]

The characteristics of this infidel king (verses 11:31-41) are (for a more detailed explanation, see source [9], whose author is recommended by EGW on dealing with ch.11):
1) self-exaltation above every God
2) contempt of all religion
3) blasphemy against the true God
4) apostasy from the God of his fathers
5) disregarding the desire of women

Of all these six marks, only one, in the least, agrees with Antiochus. He was more zealous in the worship of his fathers’ gods than any of the kings before him.

16. [Dan.11:16-21 is not that plausible transition to Rome]

Adventists see here a transition from Antiochus III, a Seleucid king, to Rome and Pompey, instead of the widely accepted version that this is the conquest of Palestine by Antiochus III. This would require a break in the chronological flow of the text. In addition, the verb at the beginning of v.16 indicates that the story is simply continuing. Moreover, the version that the beautiful woman in v.17 is Cleopatra Syra, the daughter of Antiochus III, and not the Cleopatra who was the daughter of Ptolemy XI, makes more sense, because Antiochus III gained no advantage through this action (his daughter, Cleopatra, turned out to be loyal to Ptolemy), which doesn’t fit so well into the other version.

Instead, it is more plausible that the transition to Rome happens in verses 20-22 (see [23]).

17. [origins from Porphyry]

The origin of the Antiochus’ version is generally credited, not to a Christian exegete, but to a pagan, Porphyry, who died about A.D. 304. It was devised, not to expound, but to discredit and deny the prophetic element of the book of Daniel (see [20]).

In addition, one of the major weaknesses of the Maccabean thesis is that it claims that Rome cannot be found in the book of Daniel as a symbol, which is not very likely (the book is at least related to Revelation, where a city of seven hills is mentioned). More arguments on that can be found in [25].

18. [2300 days interpretation supported]

The idea that the 2300 years begin jointly with the seventy weeks was first introduced by the German pastor Johann P. Petri in 1768. It was also supported by more than sixty men scattered over four continents and located in twelve different countries in the early 19th century.

These included Dr. Joshua L. Wilson, moderator of the Presbyterian General Assembly; Protestant Episcopal Bishop John P. K. Henshaw, Alexander Campbell, founder of the Disciples Church, several college presidents and professors, judges, congressmen, physicians, pastors of outstanding churches, and editors of several religious journals. Even the Roman Catholic supreme court justice, Jose de Rozas of Mexico City was among them.

Nearly all of them published their expectations before William Miller’s first book appeared in Troy, New York, in 1836. (see http://www.sdanet.org/atissue/books/qod/q27.htm for details)

References:

1. Jacques Doukhan, Le Soupir de la Terre: Etude Prophetique du Livre de Daniel(Bulgarian translation)
2. Slavcho Valchanov, Interpretation of the book of the prophet Daniel
3. 101 QUESTIONS ON THE SANCTUARY AND ON ELLEN G. WHITE (http://www.lightofsalvation.com/lightofsalvation.com/Resources_files/101Questions.pdf)
4. Dale Ratzlaff, Daniel 8:14 studied in context
5. http://ellenwhiteexposed.com/2300.htm
6. http://www.2300days.com/
7. Zdravko Stefanovic, Daniel: Wisdom to the Wise. Commentary on the Book of Daniel
8. Sir Isaac Newton, Observations Upon the Prophecies of Daniel and the Apocalypse of St. John
9. E. B. Pusey, Daniel the prophet; nine lectures, delivered in the Divinity School of the University of Oxford
11. W.Shea, Supplementary Evidence in Support of 457 B.C. as the Starting Date for the 2300 Day-Years of Daniel 8:14
12. Mitko Dimitrov, The author and the time of the book of Daniel’s writing