Music Education or Cross Platform Development

Pitch is commonly mistaken for a synonym of frequency; however, pitch is actually based on perception. Pitch is the human perception of the frequency of a musical note (Heresiarch 2005). While pitch is related to frequency, the physical rate of vibration in a sound wave, it is distinctly different.

The current standardized relationship between pitch and frequency is that the note A above middle C sounds like a 440 Hz tone. In a musical context, the exact frequency of a note is far less important than its relationship to other notes. Numerous systems exist for defining the relationship between notes in a scale, usually involving a fixed frequency ratio between successive notes. The chromatic scale, which is most common in European music, uses a frequency ratio of the twelfth root of two between successive notes. Numerous other scales are in common use, with various significant intervals, but almost all make use of the octave, a frequency ratio of two. Because pitch is a human perception, and not a physical phenomenon, auditory illusions, such as a sequence of notes that seems to increase in pitch forever, are possible.
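The relationship between the 440 Hz reference and the twelfth root of two can be illustrated with a few lines of Python. This is only a minimal sketch; the function name note_frequency and the constant A4_HZ are chosen here for illustration and do not come from any of the sources reviewed.

```python
A4_HZ = 440.0  # reference pitch: the A above middle C


def note_frequency(semitones_from_a4: int, reference_hz: float = A4_HZ) -> float:
    """Frequency of the note a given number of semitones above (or below) A4.

    Each semitone step multiplies the frequency by the twelfth root of two,
    so twelve steps give exactly one octave (a ratio of two).
    """
    return reference_hz * 2.0 ** (semitones_from_a4 / 12.0)


if __name__ == "__main__":
    for name, offset in [("middle C", -9), ("A4", 0), ("A5", 12)]:
        print(f"{name}: {note_frequency(offset):.2f} Hz")
```

Twelve successive steps of the twelfth root of two multiply the frequency by exactly two, which is why the octave survives in this scale while every other interval is only approximated.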

Psychoacoustics is the study of the perception of sounds (Omegatron 2005). This area of study is relevant to any research on sound, but it is particularly important when dealing with pitch recognition because pitch is defined by perception alone. Sound can be accurately measured by audio signal processing software; however, the way in which those sound waves are actually received by the human ear and processed into thoughts in the brain is a more complex, and quite significant, study in itself. Because of the nature of sound, the signal can actually have an infinite amount of information to be processed in the mind. When the psychological factors involved in the perception of sound are ignored in favor of only the physiological processes of the ear system, important aspects of sound and human hearing are also overlooked.

Normal human hearing ranges from 20 Hz to 22 kHz, though the upper end of that range in particular decreases with age. Although lower frequencies may not be detected by the ears, the vibration can still be detected by the skin. The smallest change in pitch that can normally be detected is 2 Hz; changes smaller than 2 Hz may not be detectable. Eardrums are sensitive to sound pressure variations, and the upper limit of audible sounds is generally defined by whether or not the ears will be physically harmed; the louder the sound, the shorter the period for which it can be withstood. “The ear can be exposed to short periods in excess of 120 dB without permanent harm, but long-term exposure to sound levels over 80 dB can cause permanent hearing loss.” (Omegatron 2005)

Masking effects are discussed in some other sections of this review of literature. Masking is the phenomenon by which louder sounds make weaker sounds inaudible. Simultaneous masking is when two sounds occur at the same time and one masks the other; this is sometimes referred to as frequency masking because sounds which are closer in frequency to a loud sound are more easily masked. Temporal masking is when a weak sound is made immediately before or immediately after a loud sound.

Psychoacoustics allows lossy signal compression to achieve high quality by modeling which aspects of the audio signal can be removed or reduced without significantly affecting the perception of the sound. The psychoacoustic model allows for compression of audio files by working with the same concepts that make a particular sound seem very loud in a quiet atmosphere, while the same sound in a loud atmosphere seems very quiet. “It might seem as if this would provide little benefit to the overall compression ratio, but psychoacoustic analysis routinely leads to compressed music files that are 10 to 12 times smaller than high quality original masters with very little discernible loss in quality.” (Omegatron 2005) Further details about compressed audio formats, including MP3, Ogg Vorbis, and others, are discussed at length elsewhere in this literature review. These formats use a compression algorithm which identifies the sounds that are outside the range of human hearing and marks those as low priority, sacrificing low priority sounds and strengthening the high priority sounds which will absolutely fall in the range of hearing.

Before pitch became standardized, there were very large variances in pitch. Standardization occurred as a result of a desire for different performers to perform together. Since the human mind tends to notice differences in pitch more than absolute frequency, it is problematic to combine instruments that are tuned differently. The first known official standardization of pitch came in 1859, when the French government passed a law defining the A above middle C as 435 Hz. The primary motivation was the trend at the time for orchestras to raise their pitch to achieve a “brighter” sound. The increase in pitch brought complaints from singers, whose voices were increasingly strained by having to hit higher and higher notes.

Modern understanding of the theories and models of pitch would be nonexistent without those developed in the past. Modern music technology can only exist based on the music theories of the past. While applications of speech and tone recognition software could not possibly have been imagined by theorists in any previous era, their work is still essential to the advancements made in software, programming, and hardware today. Perhaps the shortcomings of today’s technology can be overcome by drawing on the knowledge of the past. Alain de Cheveigne (2004) explores the historical perspective of pitch theories and models, from their origins with the work of Pythagoras to current developments. De Cheveigne attempts to present the full spectrum of tonal theories and models from throughout history in order to evaluate the entire pursuit of understanding that has been offered on this subject, for as De Cheveigne so eloquently explains, “anyone who likes ideas will find many good ones in the history of science.” (De Cheveigne 2004)

Music theory has often been embraced historically as a natural part of the sciences, not as a part of the study of the arts alone. Today, much of the focus of music is kept at a far distance from science, and the closest application of music to science is to gain an understanding of how sound is perceived, in the hearing science field. Yet even this field has, historically, been focused on musical pitch, and once again music is drawn into a scientific light. “Music once constituted a major part of Science, and theories of music were theories of the world.” (De Cheveigne 2004)

This is extremely relevant to the work at hand because of the nature of this connection between music and science that inarguably exists in the field of music technology. Although there may be controversy among the apparently conflicting models which have emerged in the past, the only controversy should be about the reluctance to incorporate all relevant ideas in order to further art and science. Education can only reach its potential for guiding students to enlightenment when approached without prejudice or narrow-mindedness; the success of technology in particular serves as a prime example of the movement to return to an inseparable coexistence of the arts and science, and provides an educational goal for enhancing the use of musical technology. Early and recent theories alike should be considered of equal value, including those which attempt to explain consonance and musical scales, the physiology of the ear, the physics of sound, and modern pitch perception models.

The first mathematical theory of musical intervals is credited to the 6th century BC mathematician and philosopher Pythagoras, and his work has been monumentally important in all the music theory work which followed. Using a monochord, he was able to relate mathematical ratios of string length to musical intervals. This instrument consisted of one string and three bridges, with two of the bridges stretching the string across the length of the board, and the third bridge dividing the string into two segments. “Intervals of unison, octave, fifth, and fourth arise for length ratios of 1:1, 1:2, 2:3, 3:4, respectively.” (De Cheveigne 2004) The monochord is one example of a psychophysical model. This model illustrates the connection between the perception of music and a physical quantity, as expressed in the ratio.

The mystical applications of Pythagoras’s discovery took precedence over the actual mechanics of it for many of those interested in his work, which is an example of another connection between areas of study which once co-existed but are quite separate for most modern academics: faith, art, and science. “Ratios of numbers between 1 and 4 were taken to govern both musical consonance and the relations between heavenly bodies.” (De Cheveigne 2004) The relationship between the musical and the mystical is valid, but this did mark an important point where controversy began interfering with the advancement of music theory.

Aristoxenos, two centuries after Pythagoras presented his model, sought to discredit the standing theories held by Pythagorean devotees. In his works, he argued that numbers are not relevant to music, and that music is based on the perception of what one hears, not on any mathematical equation. Descartes and Vincenzo Galilei (Galileo’s father) also both discredited the music-to-math theories that formed the revolutionary basis for Pythagoras’s music work, but not on the basis that music and numbers are unrelated. Rather, Galilei in particular reasoned that the tension of a string, not its length, should be the variable related to pitch when forming the sound ratio. “Using weights to vary the tension of a string, he found that the above mentioned intervals arise for ratios of 1:1, 1:4, 4:9, and 9:16 respectively. These ratios are different from those found for length; they are more complex, and don’t agree with the importance that the Pythagoreans gave to numbers from 1 to 4.” (De Cheveigne 2004) The debate between followers of Pythagoras and followers of Galilei (and the many others who have contributed conflicting pitch theories) is symbolic of the questions that remain in music theory today about the precise roles, and the importance of each role, taken by mathematics, physics, and perception, that is, the laws of the universe.
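The two sets of ratios are less contradictory than they first appear. Under the standard physics of an ideal string (an assumption added here, not a claim taken from the sources reviewed), frequency is inversely proportional to length and proportional to the square root of tension, so Galilei’s tension ratios are simply the squares of the Pythagorean length ratios. The following Python sketch checks this; all names in it are illustrative.

```python
from fractions import Fraction
from math import isqrt

# Ratios quoted in the text (De Cheveigne 2004).
LENGTH_RATIOS = {
    "unison": Fraction(1, 1),
    "octave": Fraction(1, 2),
    "fifth": Fraction(2, 3),
    "fourth": Fraction(3, 4),
}
TENSION_RATIOS = {
    "unison": Fraction(1, 1),
    "octave": Fraction(1, 4),
    "fifth": Fraction(4, 9),
    "fourth": Fraction(9, 16),
}


def exact_sqrt(ratio: Fraction) -> Fraction:
    """Square root of a ratio whose numerator and denominator are perfect squares."""
    num, den = isqrt(ratio.numerator), isqrt(ratio.denominator)
    if num * num != ratio.numerator or den * den != ratio.denominator:
        raise ValueError(f"{ratio} is not a ratio of perfect squares")
    return Fraction(num, den)


def interval_size(frequency_ratio: Fraction) -> Fraction:
    """Express an interval as higher frequency over lower frequency."""
    return frequency_ratio if frequency_ratio >= 1 else 1 / frequency_ratio


def interval_from_length(length_ratio: Fraction) -> Fraction:
    # Ideal string assumption: frequency is inversely proportional to length.
    return interval_size(1 / length_ratio)


def interval_from_tension(tension_ratio: Fraction) -> Fraction:
    # Ideal string assumption: frequency is proportional to sqrt(tension).
    return interval_size(exact_sqrt(tension_ratio))


if __name__ == "__main__":
    for name in LENGTH_RATIOS:
        by_length = interval_from_length(LENGTH_RATIOS[name])
        by_tension = interval_from_tension(TENSION_RATIOS[name])
        print(f"{name}: length gives {by_length}, tension gives {by_tension}")
```

Both routes give the same frequency ratios (1, 2, 3/2 and 4/3), which suggests that the disagreement was about which physical quantity to measure rather than about the intervals themselves.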

One can see how the development of music and pitch theories exemplifies the cyclic nature of history and science, and not only because of the continued questions regarding perception and the place of the sciences. Anyone familiar with the development of music-related software may recognize the same type of dueling theories and conflicting coding and hardware ideals among developers. However, issues surrounding the development of actual software and hardware for pitch recognition and other music and sound related purposes will be discussed later in this literature review.

Returning to the time period at hand, using the Greek concept of pitch led to the understanding of pitch relations as consistent with the physics of sound. Galileo measured the relationship between string length and vibration frequency; Mersenne, however, lengthened the strings used and was therefore able to count the individual vibrations, a measurement which today is taken with technology that would have been inconceivable at the time, but which functions by means similar to counting the vibrations by sight. Mersenne was therefore able to determine “the actual frequencies of each note of the scale. This provided a relation of pitch with number that was firmly grounded in the physics of sound.” (De Cheveigne 2004)

Resonance was noted by Aristotle, and it became a concept that has been used since that time in theories on hearing, the popularity due in part to the already common notion of “like by like.” (De Cheveigne 2004)

Du Verney, a theorist from the 1600s, made many important contributions to resonance theory, and his work “concentrates several key concepts of place theory: frequency-selective response, tonotopy, and tonotopic projection to the brain.” (De Cheveigne 2004) The cochlea was compared to a steel spring in order to explain resonance, suggesting that the bony spiral lamina was the source of resonance. Others later suggested that the basilar membrane had strings like a harpsichord.

The concepts relating to superposition and Ohm’s law were difficult for many theorists to grasp until the 1700s. Mersenne, previous to that time, “reported that he could hear within the sound of a string, or a voice, up to five pitches…. He knew also that a string can respond sympathetically to higher harmonics, and yet he found it hard to accept that it could vibrate simultaneously at all those frequencies.” (De Cheveigne 2004) It was during the eighteenth century that the terms “fundamental” and “harmonic” were first used, and it was also the period in which the actual physics of string vibrations, multiple vibrations included, was comprehended on a relatively complete level. Linear superposition was a concept introduced by Euler, and it was particularly important for making simultaneous vibrations at different frequencies comprehensible.

Earlier researchers, such as Mersenne and Galileo, thought of vibrations as periodic, but the shape of the vibrations was not taken into consideration. Mersenne, of course, had no way to observe the shape of the periodic vibrations. This was one of the concepts that revolutionized the physics of sound in the eighteenth century, and in 1820 Fourier developed a theorem regarding the superposition of sinusoids that influenced the mathematics and physics of the time period, as well as the developments in the physics of sound that would follow. Fourier’s theorem stated that a vibration might contain several sinusoidal partials, and therefore several different frequencies, depending on its shape. Ohm’s law, formulated in 1843 but later made far more clear, stated that every pitch corresponds to a sinusoidal partial within the stimulus waveform. “Ohm’s law extended the principle of linear superposition to the sensory domain…. The sensation produced by a complex sound such as a musical note was ‘composed’ of simple sensations, each evoked by a partial. In particular, [Helmholtz, who rephrased and clarified Ohm’s work,] associated the main pitch of a musical tone to its fundamental partial.” (De Cheveigne 2004) Ohm therefore related pitch to the period of one of the sinusoidal partials, not to the period of the vibration as a whole.
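The idea that one periodic vibration contains several sinusoidal partials can be made concrete with a small Python sketch. It is illustrative only: the sample rate, the choice of partials and both function names are assumptions made for this example. A tone is built by superposing three sinusoids, and the strength of each partial is then recovered with a single-frequency Fourier sum.

```python
import math


def synthesize(partials, sample_rate, n_samples):
    """Superpose sinusoidal partials given as (frequency_hz, amplitude) pairs."""
    return [
        sum(amp * math.sin(2 * math.pi * freq * n / sample_rate) for freq, amp in partials)
        for n in range(n_samples)
    ]


def partial_amplitude(signal, freq_hz, sample_rate):
    """Amplitude of one sinusoidal component, from a single-frequency Fourier sum.

    Exact only when freq_hz completes a whole number of cycles within the
    signal; otherwise the result is an approximation.
    """
    re = sum(x * math.cos(2 * math.pi * freq_hz * n / sample_rate) for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * n / sample_rate) for n, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / len(signal)


if __name__ == "__main__":
    rate = 8000  # samples per second (assumed for the example)
    tone = synthesize([(200, 1.0), (400, 0.5), (600, 0.25)], rate, 800)
    for freq in (200, 300, 400, 600):
        print(f"{freq} Hz partial: amplitude {partial_amplitude(tone, freq, rate):.3f}")
```

The waveform as a whole repeats every 1/200 of a second, the period of the fundamental, even though it also contains partials at 400 Hz and 600 Hz. This is the distinction drawn above between the period of a single partial and the period of the vibration as a whole.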

The work done by Ohm and Helmholtz directly contradicted work completed previously by Seebeck and others. Seebeck found that pitch does not depend on a particular partial. However, Helmholtz was able to explain certain aspects of higher pitches and other circumstances that had been observed by previous researchers and that did not fit into the work by Seebeck; he believed strongly in Fourier’s theorem and could not fathom that any work which did not allow Fourier’s theorem to remain intact could be valid. The work of Ohm, Helmholtz, and Seebeck has appeared in contradicting arguments in many later works, and has proved a quandary for many researchers.

Pattern matching models assume that pitch, when the fundamental partial is missing, continues to be perceivable by the human mind because of the human ability to reconstruct a pattern when a part of that pattern is missing. This answers the questions posed by many pitch researchers regarding whether or not the fundamental partial is actually the necessary correlate of pitch. If the human mind can use other parts of the pattern, such as the harmonics associated with the pitch, then the pitch may be perceivable without this supposedly necessary correlate. “This idea was prefigured by Helmholtz’s ‘unconscious inference’ and…Mill’s concept of ‘possibilities.’ As a possible mechanism, Thurlow suggested that listeners use their own voice as a ‘template’ to match with incoming patterns of harmonics.” (De Cheveigne 2004) Throughout the 1900s, many researchers suggested variants on the pattern matching theories, proposing both learned means (such as Terhardt’s work, which introduced the virtual pitch) and intuitive means by which the human mind perceives pitch. Shamma and Klein later responded to Terhardt’s learned model with the suggestion that exposure to noise could produce such learning; harmonic relations may therefore be interpreted as a mathematical property which is discovered, rather than specifically learned.
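A toy version of pattern matching shows how a pitch can be assigned when the fundamental is absent. The Python sketch below is not any of the published models named above; it is a deliberately simple harmonic-template search, with a hypothetical function name and an assumed candidate range, that looks for the fundamental whose harmonic series best fits a set of observed partials.

```python
def estimate_fundamental(partials_hz, candidates_hz=range(50, 1001)):
    """Pick the candidate fundamental whose harmonics best match the partials.

    For each candidate, every partial is compared with its nearest harmonic
    and the absolute mismatches are summed. Ties are resolved in favour of
    the highest candidate, because any subharmonic of a good candidate fits
    equally well.
    """
    best_f0, best_error = None, float("inf")
    for f0 in candidates_hz:
        error = 0.0
        for partial in partials_hz:
            harmonic_number = max(1, round(partial / f0))
            error += abs(partial - harmonic_number * f0)
        if error < best_error or (error == best_error and f0 > best_f0):
            best_f0, best_error = f0, error
    return best_f0


if __name__ == "__main__":
    # Partials at 400, 600 and 800 Hz: the 200 Hz fundamental itself is missing.
    print(estimate_fundamental([400, 600, 800]))
```

The search reports 200 Hz for partials at 400, 600 and 800 Hz even though no energy is present at 200 Hz. A search this naive depends heavily on its candidate range and tie-breaking rule, which hints at why so many variants of the pattern matching idea have been proposed.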

Temporal models were generally less elaborate than resonance models; early temporal models suggested that pulse patterns were “handled” by the brain, rather than within the ears. Important to these early temporal models was the concept that a vibrating string (or other sound-producing body) hits the air many times, so that pitch reflects how many times the air pushed forward by it strikes the ear. It was in the fifth and fourth centuries BC that Democritus and Epicurus, respectively, first introduced this idea by stating that a sound-producing body actually emits atoms that are projected to the listener’s ear. Anaxagoras, in the fifth century BC, explained that hearing was “penetration of sound to the brain,” and Alcmaeon of Crotona, also in the fifth century BC, elaborated that “hearing is by means of the ears, because within them is an empty space, and this empty space resounds.” (De Cheveigne 2004)

Perhaps the most notable difference between temporal and resonance models is the amount of time which is required to make a frequency measurement. Resonance involves the build-up of energy by accumulation of successive waves. Helmholtz found that notes, in music, could follow one another at a rate of eight notes per second; based on this time-frame, Helmholtz calculated what he believed to be the narrowest bandwidth for cochlear filters. Therefore, frequency resolution was not related to cochlear filters, but rather dictated by temporal resolution. “In contrast, a time-domain mechanism needs just enough time to measure the interval between two events (plus enough time to make sure that they are not both part of a larger pattern). The time required is on the order of two periods of the lowest expected frequency.” (De Cheveigne 2004)

Temporal models can be seen to rely heavily on the definition of events. However, it is difficult, and perhaps even impossible, to define “event” in such a way that the measurements remain stable. The autocorrelation model, and the very similar cancellation model, are in a way an answer to the “event” question. In later developments, researchers would be able to reformulate the autocorrelation model based on actual nerve recordings. De Cheveigne found it of particular interest to compare the autocorrelation model with the string model noted by many of the researchers whose work has been reviewed in this section. Both similarities and innate differences can be found in the function of the delay within these types of sound theories. “Implementation of autocorrelation requires a delay, associated with a multiplier….Delayed patterns are multiplied with undelayed patterns. The string too consists of a delay that… feeds upon itself. Delayed patterns are added to undelayed patterns…” (De Cheveigne 2004)
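The “delay and multiply” operation in the quotation can be sketched in a few lines of Python. This is a minimal illustration of autocorrelation-based pitch estimation rather than a model of the auditory system; the function names, search range and test signal are assumptions made for the example.

```python
import math


def autocorrelation(signal, lag):
    """Multiply the signal by a copy of itself delayed by `lag` samples, then sum."""
    return sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))


def autocorrelation_pitch(signal, sample_rate, min_hz=50.0, max_hz=400.0):
    """Estimate pitch as the sample rate divided by the best-matching delay."""
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: autocorrelation(signal, lag))
    return sample_rate / best_lag


if __name__ == "__main__":
    rate = 8000  # samples per second (assumed for the example)
    tone = [
        math.sin(2 * math.pi * 200 * n / rate) + 0.5 * math.sin(2 * math.pi * 400 * n / rate)
        for n in range(800)
    ]
    print(autocorrelation_pitch(tone, rate))
```

No “event” needs to be defined: the delayed and undelayed patterns simply agree best when the delay equals one period, here 40 samples, or 200 Hz. The estimate is limited to whole-sample delays, a restriction that becomes relevant in the later discussion of integer versus fractional pitch estimates.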

Today, pitch is normally (and most effectively) explained using the autocorrelation and pattern-matching theories.

Even today, however, different models that contradict one another in many ways are considered valid, although there are similar roots among even the most contradictory of models. A basic understanding of the development of the theories of pitch which have been proposed may reflect the levels of understanding that a young student first researching this topic may experience, from the very simplistic observation of a vibrating string to the ability to calculate complicated concepts using an understanding of nerve impulses.

According to Neuhoff et al. (2002), in a study completed for the International Conference on Auditory Display, pitch perception is one of the most widely researched topics in auditory studies, having been studied for hundreds of years and resulting in thousands of scientific studies on the subject. The information and research presented in this study are of particular relevance to my own original research, which is yet to be completed, because of the focus on the ability of both musicians and non-musicians to interpret pitch change and intervals. The forthcoming methodology will go into further detail; however, it is important to note at this point in the literature review that one of the main questions to be answered concerns the incorporation of technology into the classroom and the effect that it has on students’ ability to recognize pitch and tone intervals, before and after musical training, with and without the aid of technology.

The study conducted by Neuhoff (2002) reveals that musical training is absolutely a factor which contributes to the ability of an individual to successfully identify pitch changes. Frequency changes have important applications beyond music alone to the scientific community; “The sonification of blood oxygen levels, geological and geophysical data in gas and oil explorations, graphical information from bivariate scatterplots, elementary mathematics instruction, historical weather patterns, and even internet traffic and performance.” (Neuhoff et al. 2002) Interpretation of variable change and sound is based on perception, and musicians have an increased ability to perceive such changes with accuracy. Musicians have been shown to have increased ability in many areas in addition to this one, such as tuning, categorization, memory, selective attention, and neurophysiological structure and function. (Neuhoff et al. 2002) This is testimony to the benefits of musical training for all students, whether or not that student will be pursuing a career in music or in any other subject.

Two distinct experiments were conducted by Neuhoff et al. to identify the differences in ability between musicians and non-musicians to identify pitch change. The participants in the first study ranged from eighteen to twenty-two years old, had normal hearing abilities, and represented three levels of musical training: music majors with nine or more years of formal musical training, non-music majors with seven or more years of musical training, and finally non-music majors with no significant musical training.

In the first experiment (Experiment I), these three groups were exposed to tone intervals that were unfamiliar to them and did not correspond to a standard musical scale, and attempts were made to minimize the advantage trained musicians would have based purely on previous exposure by using unfamiliar scales and timbres. The participants were asked to indicate on a slider the amount of pitch change as well as the direction of pitch change.

The results of Experiment I show that trained musicians were far more accurate in identifying pitch change with regard to both the size of the interval played and the direction of the pitch change. Musical novices, those without significant training, often identified rising pitch changes as falling and vice versa, despite the fact that the smallest interval played was many times greater than the threshold for frequency discrimination. Although other studies have shown that musicians have increased motor ability, this ability was not a determining factor in this study. The motor skills required were very simple, involving only the control of a computer mouse; nevertheless, motor skill was ruled out as a factor by repeating a similar experiment using brightness rather than tone intervals. This confirmed that the different results between musicians and non-musicians in this study were due to the ability to interpret tone intervals.

Experiment II, conducted by Neuhoff et al. (2002), further studied the perception of pitch change direction, and sought to explain the cause of errors in determining whether the intervals were rising or falling.


When bringing technology into a classroom, it is very important to keep this type of issue at the forefront of concerns. What would a classroom of students learn from technology that simply does not perform correctly? Given a situation where twenty or more students may be experimenting with sound processing, pitch recognition, music production, speech recognition and synthesis, and other sound-based technology, there is almost inevitably going to be background noise and sound interference. Cochannel speech, a signal that is a combination of speech from two talkers or sound sources, is likely to occur in this classroom environment, as it is in many others. Examples include any situation where a microphone is not entirely isolated or placed properly for optimized reception of the speaker’s voice, or where more than one person is speaking at the same time. Students who are attempting to evaluate how effective a given piece of software is at properly recognizing the pitch input, or how well a piece of software can recognize speech commands, may be misled about its capabilities (both actual and potential) if background noise or other students’ voices are seriously impeding the ability of the program to perform the specified function. “The presence of interference causes the quality or intelligibility of speech to degrade. A noisy environment reduces the listener’s ability to understand what is said.” (Ma 2003)

In the study conducted by Ma, audio recordings were made of two simultaneous speakers. The intent was to have the two different voices separated by the computer, and for these two different speakers to be played back separately. This would make it possible to easily transcribe the recorded spoken track at a later time. Because of this intended purpose for the recordings, the cochannel speaker separation did not have to include continuous real-time throughput. However, because of the intent to have the recordings transcribed by humans, not by another computer program, it was necessary to focus on how well the speech was separated while maintaining a natural sound and very little interference. This process is more complicated than the similar process which will be conducted in the present study, which will focus more on the ability of software to filter out background noise to strengthen the signal of a single speaker; the two processes are nevertheless strongly related.

The proper separation of the different speakers in cochannel speech can be important in any recording situation, but it is of particular importance if an automatic speech recognition system is in use. Many of the solutions to these difficulties that have been suggested in the past are inadequate or completely ineffective due to particular assumptions about the source of the interfering sound. For example, the assumption may have been made that the background noise is stationary for filtering, yet if the situation is not precisely as the programmers expected, the filtering will fail. The 1970s gave birth to new methods of separation of cochannel speech; “these methods operate on the notion that spectral harmonics of each speaker are separated exploiting the pitch estimate of the stronger talker derived from the cochannel signal.” (Ma 2003) By processing speech frame-by-frame, using a YIN pitch estimator to design the filters that identify the frequency domain of each speech signal, two simultaneous sound sources can be recovered separately, and also re-synthesised, to provide a clear digital signal.

The pitch estimation process used in cochannel speaker separation is actually very similar to the pitch estimation techniques used in single-talker pitch estimation. This is a well-documented technique that can now be applied to cochannel speech, for use in determining the pitch of the stronger speech signal in cochannel speech. The ML pitch detector, examined in previous studies, has been shown to perform better than other pitch detectors, such as the cepstral, harmonic matching, and auditory synchrony-based pitch detectors. The high regard for this pitch detector is due to its ability to perform well in noisy conditions, which is precisely the type of situation in which cochannel speech will occur. However, the ML pitch detector has a significant drawback, in that it outputs an integer pitch estimate rather than a fractional pitch estimate, which can lead to less accurate results in some circumstances. This is the reason that the YIN algorithm was chosen for the study conducted by Ma (2003). “The YIN algorithm outperforms the ML approach in the way that it outputs a fractional pitch period estimate directly, and it is also relatively a simple and effective algorithm to implement.” (Ma 2003) Compared to the ML method, the YIN method is about three times as accurate, while it remains less intensive and more applicable in real-time processing of speech than some other methods which have been suggested for use. YIN is also described as being simple, efficient, and as having a low latency, also making it, from Ma’s perspective, ideal for cochannel speech separation.
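To make the difference between an integer and a fractional pitch period concrete, the following Python sketch follows the commonly described steps of the YIN algorithm: a difference function, a cumulative mean normalised difference, an absolute threshold, and parabolic interpolation. It is a simplified illustration and not Ma’s implementation. The function name, the parameter defaults and the decision to interpolate on the normalised function are all assumptions made for this example.

```python
import math


def yin_pitch(signal, sample_rate, min_freq=60.0, max_freq=500.0, threshold=0.1):
    """Simplified YIN-style pitch estimate returning a fractional-period frequency."""
    max_tau = int(sample_rate / min_freq)
    min_tau = max(2, int(sample_rate / max_freq))
    window = len(signal) - max_tau
    if window <= 0 or min_tau >= max_tau:
        raise ValueError("signal too short for the requested frequency range")

    # Step 1: squared difference between the signal and its delayed copy.
    diff = [0.0] * (max_tau + 1)
    for tau in range(1, max_tau + 1):
        diff[tau] = sum((signal[j] - signal[j + tau]) ** 2 for j in range(window))

    # Step 2: cumulative mean normalised difference (defined as 1 at tau = 0).
    cmnd = [1.0] * (max_tau + 1)
    running_sum = 0.0
    for tau in range(1, max_tau + 1):
        running_sum += diff[tau]
        cmnd[tau] = diff[tau] * tau / running_sum if running_sum > 0 else 1.0

    # Step 3: first dip below the threshold, followed down to its local minimum.
    tau_est = None
    for tau in range(min_tau, max_tau):
        if cmnd[tau] < threshold:
            while tau + 1 < max_tau and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            tau_est = tau
            break
    if tau_est is None:  # no dip found: fall back to the global minimum
        tau_est = min(range(min_tau, max_tau), key=lambda t: cmnd[t])

    # Step 4: parabolic interpolation around the minimum gives a fractional period.
    a, b, c = cmnd[tau_est - 1], cmnd[tau_est], cmnd[tau_est + 1]
    curvature = a - 2.0 * b + c
    shift = 0.5 * (a - c) / curvature if curvature != 0 else 0.0
    return sample_rate / (tau_est + shift)


if __name__ == "__main__":
    rate = 8000  # samples per second (assumed for the example)
    tone = [math.sin(2 * math.pi * 210 * n / rate) for n in range(800)]
    print(f"fractional estimate: {yin_pitch(tone, rate):.2f} Hz")
    print(f"nearest integer period: {rate / 38:.2f} Hz")
```

At a sample rate of 8000 Hz, a 210 Hz tone has a period of about 38.1 samples. An estimator restricted to whole samples can do no better than 38 samples, which corresponds to roughly 210.5 Hz, while the interpolated estimate lands much closer to the true value. This is the kind of advantage the fractional output is meant to provide.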

In order to process the YIN results in such a way that many errors in the pitch estimation are removed, median smoothing was utilized. The median of every sequence is used in place of the actual center value, which is preferable to a linear filter. After a series of median-smoothing filters is applied, the result is processed by a rule-based smoothing algorithm as well. “Combining all these post-processing methods, the YIN pitch estimator is believed to work very well in the cochannel speech separation system.” (Ma 2003)
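Median smoothing of a pitch track can be sketched as follows. This is an illustrative Python example only: the window length, the edge handling (repeating the first and last values) and the function name are assumptions, and Ma’s actual filter lengths are not reproduced here.

```python
from statistics import median


def median_smooth(values, window=3):
    """Replace each value by the median of the odd-length window centred on it."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    if not values:
        return []
    half = window // 2
    # Repeat the edge values so that every position has a full window.
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(values))]


if __name__ == "__main__":
    # A pitch track in Hz with one gross error (200) in the middle.
    print(median_smooth([100, 101, 200, 102, 103]))
```

The isolated outlier is removed outright, whereas a linear (averaging) filter would smear it across the neighbouring frames. This is one reason a median filter may be preferred for correcting occasional gross pitch errors.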

The five features used within the algorithm in Ma’s study to separate the cochannel speech were the energy of the signal, the zero-crossing rate of the signal (an indicator of the frequency at which the energy is concentrated in the signal spectrum), the first predictor coefficient, the energy of the prediction error, and the autocorrelation coefficient at unit sample delay.
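Three of these five features can be computed directly from the samples of a frame, as the Python sketch below illustrates. The definitions used (sum of squares for energy, fraction of sign changes for the zero-crossing rate, and energy-normalised lag-one correlation) are common textbook forms assumed for this example; they are not taken from Ma’s code. The two linear-prediction features are omitted.

```python
def frame_energy(frame):
    """Energy of the frame: the sum of the squared samples."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def unit_delay_autocorrelation(frame):
    """Correlation of the frame with itself delayed by one sample, divided by its energy."""
    energy = frame_energy(frame)
    if energy == 0:
        return 0.0
    return sum(a * b for a, b in zip(frame, frame[1:])) / energy


if __name__ == "__main__":
    rapid = [1.0, -1.0, 1.0, -1.0]  # energy concentrated at a high frequency
    steady = [1.0, 1.0, 1.0, 1.0]   # energy concentrated at a low frequency
    for name, frame in (("rapid", rapid), ("steady", steady)):
        print(name, frame_energy(frame), zero_crossing_rate(frame),
              unit_delay_autocorrelation(frame))
```

A rapidly alternating frame has a high zero-crossing rate and a negative unit-delay correlation, while a slowly varying frame shows the opposite, which is why such inexpensive measures are useful indicators of where the energy of a frame is concentrated.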

Normally, when enhancing speech, the most significant factor taken into account is the signal-to-noise ratio (SNR). Signal to noise ratio is the ratio of useful information (signal) to useless information (noise) (Belleke et al. 2005). A commonly encountered example is static on a radio or television interfering with a program; up to a certain point, most people still enjoy the program, but when the signal to noise ratio becomes too low, they do not. Signal to noise ratio is related to, but different from, the concept of dynamic range. Signal to noise ratio measurements compare the power of a representative signal to the power of the noise.

Dynamic range, on the other hand, compares the strongest possible undistorted signal to noise. Signal to noise ratio is meaningful when analyzing actual signals, such as recordings or transmissions. Dynamic range is meaningful when testing the performance of a piece of equipment, such as a speaker or a transmitter. For digital encoding, the signal to noise ratio with respect to noise added by the encoding (quantization noise) is approximately 6 dB per bit. This noise is in addition to whatever noise was already present in the signal being encoded. Use of more bits per sample, and hence a higher bitrate, improves signal to noise ratio at a cost of memory and storage. The appropriate bitrate must therefore be selected based on quality requirements as well as the limits of the hardware being used.
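The figure of roughly 6 dB per bit follows from expressing power ratios in decibels. The Python sketch below uses the standard textbook expression for an ideal quantizer driven by a full-scale sine wave, which is an assumption added here and not drawn from the sources cited; the function names are illustrative.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal to noise ratio in decibels, from two power measurements."""
    return 10.0 * math.log10(signal_power / noise_power)


def quantization_snr_db(bits):
    """Theoretical SNR of an ideal quantizer for a full-scale sine wave.

    Equivalent to the familiar approximation 6.02 * bits + 1.76 dB.
    """
    return 20.0 * math.log10(2 ** bits) + 10.0 * math.log10(1.5)


if __name__ == "__main__":
    print(f"signal 100 times stronger than noise: {snr_db(100, 1):.1f} dB")
    for bits in (8, 16, 24):
        print(f"{bits}-bit encoding: about {quantization_snr_db(bits):.1f} dB")
```

Each additional bit doubles the number of quantization levels and adds about 6.02 dB, so moving from 8-bit to 16-bit encoding improves the theoretical ratio by about 48 dB while doubling the storage needed for every sample.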

However, concentrating on the signal-to-noise ratio alone is not necessarily the most appropriate way to approach enhancing speech.

Once again, technology struggles to mimic intuitive human capabilities. Assessing speech intelligibility by measuring the speech reception threshold (SRT) can help to identify the positive and negative qualities of speech that are intuitively understood by the human mind. “A SRT is the lowest intensity and equally weighted two syllable word is understood approximately fifty percent of the time. The pure tone average and speech reception threshold should be within 7 dB of each other. Comparison of the speech reception threshold and the pure tone average serves as a check on the validity of the pure tone thresholds.” (Ma 2003)

In the conclusion of this research, Ma found that the chosen methods could be quite effective in many situations. However, there remained some shortcomings, resulting perhaps from some of the inherently imperfect programming choices discussed at some length in the following paragraphs. When two speakers simultaneously spoke at the same pitch, or when the two speakers had overlapping harmonics, the algorithm could not separate the two individual voices, and instead processed them both as a single speaker’s input. “To solve this problem pure signal processing technique might be inadequate. A solution to this might be considered to require source-specific knowledge, but this is often impossible in realistic situation.” (Ma 2003)

In the research conducted by Ning Ma, MATLAB is the chosen tool for most of the implementation and experiments. Ma describes MATLAB as a “high-performance language for technical computing.” (Ma 2003) This chosen program, however, has many flaws, and is relevant to the present study because of these flaws. It remains essential to evaluate the research methods chosen by previous researchers when designing a new study.

MATLAB is not an appropriate tool for use in serious computer science research. It is a good user-friendly mathematical computation system, appropriate for anything from cheating on algebra homework to performing complex calculations rapidly within an engineering project. It is inappropriate for computer science research because it is closed source, and cannot easily be used as a function library from within a general-purpose programming language. Open-source tools such as Maxima, Axiom and Octave are likely to be more appropriate for computer science research.

Closed source software is inappropriate for computer science research because it introduces unknown variables; the researcher can only guess about what the software is doing, and has no way to verify that it is correct. Researchers who are also good programmers are likely to find closed-source programming tools intolerable to use because it is difficult to modify them to suit the programmer’s needs; it is unlikely that any unmodified tool will fit a good programmer’s needs perfectly. (Graham 2004)

MATLAB is not designed to work as a function library for use within any general-purpose programming language. Any work done in MATLAB that is intended for use in a real application must be translated into the language of that application. (cite that guy’s dissertation) Maxima is an open-source computer algebra system written in Common Lisp. It might be more accurate to describe Maxima as a minilanguage within Common Lisp rather than a separate application. Lisp programs can embed Maxima, and code written for Maxima can contain Common Lisp expressions. This means that a formula can be perfected using Maxima, then imported directly into a Lisp program.

Ma does touch briefly on some of MATLAB’s shortcomings in the statement that, “MATLAB is not designed to develop a separate speech processing system, as it cannot be compiled to an independent program and is difficult to build a function library to be used by other programs.” (Ma 2003) However, the inappropriate application of programming languages continues: Ma utilized Java for these purposes. Java is an inappropriate language for computer science research. Java is closed source, so it cannot be inspected or modified, but the technology itself may be an even worse problem.

Java was designed by Sun Microsystems with several goals in mind. One goal was to deliver compiled code that would run unmodified on several different processors and operating systems. Small Java programs called applets were intended to run inside web browsers, making it possible for developers to deliver a single applet that would work no matter what operating system and browser a visitor to the site was using. Browser applets turned out to be a far less important use of Java than Sun expected; Java is used as a general-purpose programming language for both server and desktop applications, and is especially popular for cross-platform development. Another major goal of Java was to become popular for use by large corporations for their own software development, in an attempt to weaken Microsoft’s market position. (Graham 2001c) Java is currently very popular for use in computer science education, most likely as a direct result of its popularity among large corporations.

Use of Java to develop cross-platform software is still popular despite the relative unimportance of browser applets. A Java program using the standard libraries can be delivered unchanged on any supported platform; however, it is common practice to make a separate package using each operating system’s native package format, for ease of installation. It is possible to create platform-specific software using Java, by making the software dependent on operating system functions or programming interfaces that are only available on the target platform, though such design decisions are actively discouraged by Sun. Delivering cross-platform software in most other languages requires only the additional step of recompilation, assuming only cross-platform libraries are used. For most languages, there exist cross-platform libraries suitable for common tasks. Languages that are tied closely to a given operating system, such as Microsoft’s Visual Basic, are generally an exception to this, but such languages are, as a rule, unsuitable for non-trivial programming tasks.

Sun’s solution to making Java programs cross-platform was to compile Java into bytecode to be run by an interpreter known as a Java Virtual Machine, or JVM. Java bytecode could be faster than most interpreted code because program elements are only a single byte.

Bytecode is generally slower than native machine code, however, and Java is no exception. There are a number of standard libraries for Java which are also cross-platform, which guarantees that a program can be written and compiled once, then deployed without modification on any system with a Java Virtual Machine.
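The idea of single-byte program elements run by an interpreter can be shown with a toy example. The sketch below is a hypothetical stack machine written in Python; its four opcodes and their numeric values are invented for illustration and are not the instruction set of the Java Virtual Machine.

```python
# Hypothetical one-byte opcodes (not real JVM opcodes).
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def run(bytecode):
    """Interpret a byte sequence on a simple operand stack."""
    stack = []
    pc = 0  # program counter: index of the next byte to read
    while True:
        op = bytecode[pc]
        pc += 1
        if op == PUSH:
            # The byte following PUSH is the literal operand.
            stack.append(bytecode[pc])
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#04x}")


# (2 + 3) * 4 encoded in nine bytes.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
print(run(program))  # prints 20
```

The same nine bytes would run on any machine that has an interpreter for this format, which is the property that makes bytecode portable; the fetch-and-dispatch loop executed for every instruction is the overhead that native machine code avoids.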

The use of bytecode and a virtual machine presents several problems when used with real applications. The most obvious problem is speed; native code always has the potential to be faster than bytecode. Numerous factors affect the ultimate speed of an application, but an optimal native code application will be faster than an optimal bytecode application. Another problem is that running any Java program requires starting up the virtual machine, a process that takes significantly longer than spawning a native process. The most significant problem may be Java’s unreasonable use of memory. According to a Sun internal memo, Java’s memory usage makes the system unacceptable for internal use. (Taylor)

Java is not only slow to execute code; it makes programmers slow as well. Java forces programmers to think slowly; programming languages are tools for the creation and expression of ideas, not just instructions for the computer to perform. (Abelson 1996) Java imposes an object-oriented programming paradigm, similar to that of C++. It does not allow the programmer the freedom to program any other way; attempts to do so result in poorly written object-oriented code. Despite its immense popularity, object-oriented abstractions only map cleanly onto a small number of programming problems, such as filesystems and graphical user interfaces. It is certainly helpful to have object-oriented abstractions available in a language; however, it is counterproductive for the language to enforce their use.

For optimal productivity, programming style should be determined as much as possible by the programmer, not the language. In larger projects, the project’s leader should set style guidelines, not rely on the limitations of the language to prevent programmers from straying from the desired style. It is instructive to compare Java with a language that imposes no such restrictions, one that treats not only programmer-defined functions but even programmer-defined changes to the syntax as equivalent to what is built in. Lisp is such a language, or to be more accurate, such a category of languages. Common Lisp, the Lisp dialect currently most popular for writing real applications, not only permits the use of all major programming paradigms; it allows the programmer to define new ones. The Common Lisp Object System, for example, is just a set of Lisp functions and macros that provide an object system; if it were not part of the language, any Lisp programmer could write something similar into his program. Relative to Common Lisp, Java programs and Java programmers are both an order of magnitude slower. (Gat 2000)

The source of the problems with Java is that it was not designed as a tool for its designers to use; it was designed for people less intelligent than the designers to use. Like most tools designed to keep users out of trouble, Java tends to get in the way of skilled users. While lowered productivity is the most obvious problem, it is not the most serious with regard to computer science research. The use of a language that is not powerful enough does not merely hamper productivity; it actually limits what kinds of programs the programmer can write. (Graham 2002) The reason is that programming languages are tools for thinking in. When the abstractions used by the language are not powerful enough, programmers end up having to hold extra details in their minds. Humans are quite bad at keeping track of details, relative to computers, so programming languages should do their best to allow programmers to abstract away details. (Graham 2001b) It is not possible or even desirable for any language designer to anticipate every possible abstraction a programmer might want to use, so the best alternative is to impose no limits on the ability of the programmer to define new abstractions. Java imposes such limits arbitrarily, and is therefore a poor tool in which to think.

Language designer and author Paul Graham points out an important fact about programming languages that is often missed by everyone from IT managers to computer science professors: languages vary in power. (Graham 200a) Graham is not saying that a hard limit exists, or that certain languages simply cannot be used to solve certain problems. He does say that the mental limitations of programmers make it impossible to use limited languages to solve the hardest problems. When faced with a language that lacks sufficient power for the task at hand, good programmers will add features to it until it has the required power. Graham cites Philip Greenspun’s famous “tenth rule of programming” which states that “Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified bug-ridden slow implementation of half of Common Lisp.” (Greenspun)

Graham’s idea of what makes a language powerful is succinctness. Succinctness is not necessarily a matter of terse syntax, but of needing as few program elements as possible to perform a task. (Graham 2002) A program element may be anything from a function call to a conditional statement to the name of a variable. Graham considers the presence of clear patterns in code to be a sign of poor programming or a bad language; patterns are an indication that the abstractions used within the code are not powerful enough. Use of inadequate abstractions means that the programmer is having to pay too much attention to details that should be abstracted away. (Graham 2002) Succinctness is also not a matter of having libraries available to handle any task a programmer might want to accomplish; in a powerful language, the libraries themselves will also be succinct. Libraries are important to have, of course, but are not as important as the power of the language itself when developing sufficiently complex software. Succinctness is power because, on average, programmers produce about the same amount of code per unit of time no matter what language they use. Given a more succinct language, that code will do more. A good language will therefore allow a programmer to express as much logic as possible in a given amount of code. (Graham 2002)
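The point about patterns can be made concrete with a small, hypothetical example. The Python below is not taken from Graham; the function names and data are invented. The first two functions repeat the same loop pattern, and the third captures that pattern once as an abstraction.

```python
# Two functions that repeat the same filter-and-accumulate pattern.
def sum_positive_squares(values):
    total = 0
    for v in values:
        if v > 0:
            total += v * v
    return total


def sum_even_cubes(values):
    total = 0
    for v in values:
        if v % 2 == 0:
            total += v ** 3
    return total


# The pattern expressed once; each use now states only what differs.
def sum_where(values, keep, transform):
    return sum(transform(v) for v in values if keep(v))


data = [-2, -1, 0, 1, 2, 3, 4]
print(sum_where(data, lambda v: v > 0, lambda v: v * v))
print(sum_where(data, lambda v: v % 2 == 0, lambda v: v ** 3))
```

Each call to sum_where uses far fewer program elements than the function it replaces, and the details of looping and accumulating no longer have to be held in the programmer’s mind.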

Power is not just about speed; power means that the programmer can spend more time and mental energy on the problem, while spending less on the details of programming. The result is not just greater productivity, but better solutions. Graham writes in On Lisp:

Imagine the kind of conversation you would have with someone so far away that there was a transmission delay of one minute. Now imagine speaking to someone in the next room. You wouldn’t just have the same conversation faster, you would have a different kind of conversation. In Lisp, developing software is like speaking face-to-face. You can test code as you’re writing it. And instant turnaround has just as dramatic an effect on development as it does on conversation. You don’t just write the same program faster; you write a different kind of program. (Graham 1993)

Any programming problem can be theoretically solved in any language. The differences between languages become important when there is a significant difference in the quality of potential solutions; a powerful language encourages the programmer to solve bigger problems.

Graham points out the flaws in the current hype over object-oriented programming, identifying most of the reasons people like it as mistakes. Object-oriented programming allows programmers to work around the limitations of the language. The obvious solution is to use a less limited language. Object-oriented programming allows large development teams to segment code such that no member can do too much harm. Large development teams rarely produce good software. Object-oriented programming produces a lot of code, which allows programmers to convince managers that they are getting a lot of work done. This is an illusion; programmers tend to produce about the same amount of code regardless of language, so using a language that requires more code for the same task means they are actually being less productive. There is also a connection between the amount of code and the number of errors. Object-oriented languages can sometimes be extended by users, but object-orientation is not the only way to do so. Object-oriented abstractions do have the advantage of mapping neatly onto certain problems, like user-interface design and simulations. (Graham “Why Arc…”) The problem of music education software is unlikely to contain very many sub-problems that map neatly onto object-oriented abstractions.

Large companies rarely produce great software, according to Graham. (Graham 2005) It is therefore reasonable to consider software from small companies and individuals which education administrators might otherwise reject as too risky. In The Art of Unix Programming, programmer Eric Raymond writes that programs should usually be small, and that writing a large program should be done only as a last resort. Raymond is referring to partitioning the problem, not merely to code size. Raymond claims that, when possible, different sections of a problem should be solved by separate programs. (Raymond 2003) Such programs may be combined into a framework that presents users with a common interface, but should be separated for the sake of the programmer. Graham believes that programming languages should be designed to make programs short. (Graham 2001a) Keeping programs short means that the programmer can hold the entire program in his or her head, and therefore be more likely to notice the sort of context-switch that accompanies the addition of features that are not related to the original problem. The addition of such features generally signals the need to split the program up into several parts. Large companies rarely produce small tools, probably because they are more difficult to brand and sell than large applications.

Most programs are designed, then written. This typical “top down” approach makes sense for building most physical things, but not for software. Paul Graham explains in On Lisp that a bottom up approach to programming usually works best; instead of just designing a program in an available language to solve a problem, adapt the language until it is ideally suited to solving the problem at hand. (Graham 1993, 3) Since any non-trivial program is too large for a programmer to hold in his mind at once, programmers must break up programs into smaller units. Top down style is to use subroutines; bottom up style is to define new abstractions. (Graham 1993, 4) Bottom up programming makes for better programs than other methods of breaking up problems because the resulting modified language is more succinct for the type of problem being solved.

Open source is not required to produce good software, but it helps substantially. Open source makes it impossible to lie about the quality of software since anybody who wants to can inspect the code. Open source projects tend to be designed around what real users need the software to do; they are usually not influenced by pressure from management or the marketing division of some large company. The developers themselves decide what features get implemented, and usually base those decisions on their own needs. It is rare that an open source developer is not a heavy user of the project; therefore, anything that does not work well due to a coding error or design flaw is likely to be noticed and corrected much sooner than in a proprietary product.

The study conducted by Ma (2003) focused on the utilization of pitch and frequency estimation as it might be applied to speech for the purpose of speaker recognition and separation, and used to record the speech for later transcription by a human listener. However, it is important to also discuss the application of frequency estimation and pitch recognition when working with musical sounds rather than human speech. In the journal article, “Multiple Fundamental Frequency Estimation Based on Harmonicity and Spectral Smoothness” by Anssi P. Klapuri (2003), methods for estimating the frequencies of concurrent musical sounds are explained in some detail.

Pitch perception is a vital part of the functionality of human hearing in practical situations. The human mind, without any technical aid, can perceive the pitch in several sounds at one time, as well as separate one sound out from a mixture of sounds. Pitch is a perceptual attribute of sounds, and finding a way for an algorithm to mimic or surpass the human mind in pitch perception is a challenge. In the instance of single-voice speech signals, there are many possible algorithms for estimating frequency; however, musical signals most often contain multiple simultaneous sounds rather than a single one.
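For the single-sound case, one of the simplest estimation approaches is to search for the lag at which a signal best matches a delayed copy of itself. The Python sketch below illustrates that basic autocorrelation idea only; it is neither the YIN estimator nor the multiple fundamental frequency method of Klapuri, and the function name, the search range, the sample rate, and the test tone are all assumptions.

```python
import math


def estimate_f0(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate one fundamental frequency by picking the lag with the
    largest autocorrelation inside the assumed search range."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself delayed by `lag`.
        score = sum(
            samples[i] * samples[i + lag] for i in range(len(samples) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Hypothetical test tone: a 220 Hz sine sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 220.0 * i / sample_rate) for i in range(2048)]
print(estimate_f0(tone, sample_rate))
```

Because the lag must be a whole number of samples, the estimate is only approximate, and the method assumes a single periodic source. When several notes sound at once their periodicities overlap, which is the difficulty that multiple fundamental frequency estimation is meant to address.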

Applications of multiple fundamental frequency analysis include musical transcription programs, which take note of the pitch, onset time, and duration of the notes of a musical piece. In fact, the first multiple fundamental frequency analysis algorithms were designed for just this purpose. “Automatic transcription of music is seen as an important application area, implying a wide pitch range, varying tone colors, and a particular need for robustness in the presence of other harmonic and noisy sounds.” (Klapuri 2003) However, throughout the past decades this analysis has had severe shortcomings, and when dealing with simultaneous sounds a relatively large degree of transcription error was unavoidable. The only alternative to these high rates of error would be the limitation to only one particular instrument which would first have to be carefully modeled. (Klapuri 2003) Either of these options leads to a serious limitation of how the algorithm can be applied in real-life situations rather than carefully crafted research settings.

In recent years, however, there have been some advances made in this area that have significantly increased the applicability in real-life musical settings. “More recent transcription systems have recruited psychoacoustically motivated analysis principles, used sophisticated processing architectures, and extended the application area to computational auditory scene analysis in general.” (Klapuri 2003) When the first system was introduced that found the melody and bass lines in musical signals, this was a major breakthrough in this area that opened the door for many future advancements for automatic musical notation.

Multiple fundamental frequency analysis is a form of auditory scene analysis, in that the frequency is isolated without getting confused by co-occurring sounds, much like the human ability to interpret sounds which co-occur. There are, certainly, problems which are encountered during multiple fundamental frequency analysis that are not found in single fundamental frequency analysis, such as the difficulties which are introduced by the other sounds. Many algorithms function around these obstacles by breaking sound signals into smaller elements for simpler processing.

The significance of the algorithm developed by Klapuri (2003) is the ability of this particular method to resolve prominent fundamental frequencies in rich polyphonies, the ability to perform the estimation even when there is a great deal of background noise, and the ability of this algorithm to analyze non-ideal sounds. The advanced ability of this work to analyze multiple musical signals at one time while ignoring background noise is particularly important. Prior evaluation of the instruments and sound sources involved is not necessary with this algorithm, though prior knowledge of these sound sources would enhance the performance of the application. “The applications thus facilitated comprise transcription tools for musicians, transmission and storage of music in a compact form, and new ways of searching musical information.” (Klapuri 2003) This algorithm is not flawless; however, it is a notable improvement over the options available previously, and the potential for use in musical education, composition, and theory is significant.

Many of the benefits of technology have a particularly important impact on those with special needs. Children and adults with developmental or learning disabilities have access, in the modern era, to assistive technology that can greatly improve the quality of life. Exposure to advanced technology that is designed to assist special needs users can offer opportunities from early stages of development to enrich and improve interactions with the rest of society. Unfortunately, many children with special needs are in homes that are already struggling to make ends meet because of the increased expenses that often accompany raising any child, particularly when that child has special needs. Therefore, it becomes the responsibility of the public school system to provide these technological opportunities to all students, including those with special needs. “For children with special needs, the world of technology offers hope and possibilities — a way to communicate and learn.” (Kahn 2004)

Among the array of sound production programs and methods which are discussed throughout this review of literature, many are of particular note with regard to the special needs of mentally or physically challenged students. The most inclusive technological environment possible is very important in an educational setting. Ideally, the most “accessible” technology would also be the optimal choice for programmers and researchers; however, this is not always the case. While purely open-source programs running on a platform such as Linux are ideal for research and program development, Apple Macintosh technology may, in many cases, be the most appropriate for use in a special-needs environment within a classroom.

The current Apple operating system and Apple-manufactured hardware have been made with the needs of all students, and especially special-needs students, strongly in focus. Disabled computer users may find that Macs provide the most beneficial independence of any technological options. The Universal Access design of OSX, as well as the multitude of software and hardware specifically manufactured to be compatible with Apple’s Universal Access, makes it an optimal choice for the interactive learning needs of the classroom, even if there are some shortcomings to be found from a researcher’s perspective. Universal Access options on a Mac OSX network are designed with advanced accessibility for those with impaired vision, impaired hearing, and even impaired movement. The system also features functions which help special needs students participate in language and communication skill learning alongside the other students in any classroom.

For visually impaired students and other computer users, Mac OSX has several options for the transmittal of information. (Universal Accessibility 2005) Information may be sent through sound or tactile methods, rather than creating a visual display on the monitor, for those with advanced visual impairment. The OSX system is configurable in many different ways to make the screen as visible and legible as possible, featuring many monitor setting options that are not available in other operating systems, or which would require a great deal of effort to make possible in other systems which do not have such options native to the operating system. There is a spoken interface called VoiceOver that also makes the system more accessible for the visually impaired. VoiceOver functions utilizing speech, audible cues, and keyboard navigation combined. (Universal Accessibility 2005) Talking alerts and many spoken items are read aloud by the computer to bring the computer user’s attention to anything in the system that requires attention. The spoken items make all on-screen texts accessible by sound, rather than by visuals alone. This function is of particular interest in the present study because of the relevance to sound processing research. The VoiceOver interface in the OSX operating system will be used in-depth in the classroom environment as an example of how other discussed sound processing discoveries can be utilized to have real-life benefit beyond the programmer’s laboratory.

Today’s advanced sound technology is integrated completely into the Mac OSX system. The Mac is capable of functioning wholly with either written or spoken communication, both from the user to the computer, and from the computer to the user. Even the simplest programs native to the Mac OSX system utilize sound technology as an important aspect of the computer experience. The calculator can be accessed with the mouse, keyboard, or with spoken commands.

TextEdit, the simple plain text editor, even has customizable speaking voice settings for reading any text aloud. (Certainly, open-source text editors have speaking options as well, and many of these sound options are available in other platforms. However, what distinguishes the OSX system is the complete integration of the sound features and the other options as a whole.) Other sound-integration options that are simple with Mac OSX include augmentative communication devices, like electronic speaking boards and voice synthesizers, that easily function with a Mac system. (Kahn 2004)

While the VoiceOver (and related speech) functions of the OSX system are perhaps the most relevant to this study, it is certainly worth mentioning the other Universal Access features. Several special needs students are enrolled in the fifth grade classroom which will be the focus of the present study, and many of these Universal Access features will be utilized regularly or experimented with by these students to determine usefulness.

Further accessibility functions for the visually impaired user include the Zoom function, which allows the user to easily magnify everything on the screen. (Universal Accessibility 2005) For many visually impaired users, magnification of the visual output will make it accessible. The Mac OSX system utilizes a Quartz rendering and compositing engine for magnifications of up to 40x. The text, graphics, video, and other output are made larger, but unlike other systems which offer magnification functions, the Mac OSX system does this with no degradation of performance, as well as with very little pixelation, due to the Quartz rendering and compositing engine. This Zoom function is also highly customizable to suit the needs of each individual, with adjustable speed for the zooming and other features.

In addition to being able to adjust the zoom, the contrast levels on the display are also very customizable. Both inverting and desaturating the display are simple to change, and can make the on-screen display more accessible for the visually impaired.

Mac OSX additionally offers many features to assist the hearing impaired. These features are designed for those with both reduced hearing ability and no hearing ability; users that have difficulty with computer sounds and alerts have alternatives available to them, and there are also options for those in need of amplified sounds or the use of external sound devices. The ease of connecting new hardware to the Macintosh system for external sound input and output, of course, makes it additionally helpful in a classroom study of sound-technology where participating students will need to utilize external devices while experimenting with the sound-processing capabilities of each program. The students will not be expected to understand the software from a programmer’s perspective, although they will be invited to explore that aspect and will be exposed to the coding and more advanced aspects of the programs in use.

The QuickTime program, while having many flaws from a programmer’s perspective, is ideal for creating and displaying text tracks for closed captioning in many cases. Alerts for individual programs or system-wide alerts can both be easily set to display as a whole-screen flash in addition to or in the place of the “beep” sound effect alert that is used normally to get the attention of the computer user. An additional accessibility feature is the high-quality video conferencing of iChat AV. The performance and clarity of this video conferencing is such that students can communicate with others around the world in real-time sign language. Many students will also find the quality of the video conferencing to be high enough that lip-reading is possible as well. (Universal Accessibility 2005) SoftTTY is software which replaces the traditional TTY hardware devices. (SoftTTY 2005) SoftTTY is more optimal, in many ways, than the traditional hardware because it has capabilities like copy and paste, automatic phone book, and customizable font, text size, and color, which are not available with normal TTY. Partially because of the Universal Access capabilities of the Mac OSX system (as well as special accessibility functions added with a degree of more difficulty to the non-Mac systems used during this study), a hearing-impaired student in the focus group classroom will be able to participate in many of the sound-related study activities.

In addition to the hearing impairment and sight impairment accessibility functions of Mac OSX, there are many features which make the system more usable for those with physical impairments or underdeveloped motor skills. Mac has made it simple for students with physical limitations to still use the system, even if the normal keyboard, mouse, or track pad are not usable options under normal function. One example is the Slow Keys setting, which allows the user to customize a delay of any desired time between the pressing of a key (or mouse button) and that action taking effect on the system. This helps to prevent unintentional multiple keystrokes from having an adverse effect on the system. (Universal Accessibility 2005) Another example is the Sticky Keys option. This option allows the computer user to use sequential key strokes in the place of simultaneous key strokes. This makes it possible for those users who may be able to only press one key at a time to still utilize keyboard shortcuts and command options. For example, instead of having to press the Command key and the C key at the same time to copy selected text, with Sticky Keys activated the Command key can be pressed, let go, and then the C key pressed, and the selected text will still copy. There are also other keyboard settings and system settings which are customizable to optimize the accessibility of the system to users with reduced motor skills or limited physical ability.

Another option available to assist the mobility-impaired computer user to navigate the system is the Speech Recognition function. Without physically touching a keyboard, mouse, or other input device, computer users can control all aspects of the computer using verbal commands. This function is another one of the Mac OSX Universal Access capabilities which not only allows all students to participate in the study, but is also relevant to the research at hand. Apple computers had speech recognition capabilities very early in the development of their home computers, and the system is now advanced and operates smoothly and easily. Further information about the Apple Speech Recognition function can be found elsewhere in this review of literature. For the physically impaired, Speech Recognition may be used in combination with software such as the X-10 Thinking Home automation software. (X-10 2005) This software allows the computer to easily connect to any portion of a home, classroom, or office and provides control over all electrical appliances. Lights, thermostat, and other appliances can be controlled from the computer, without physically going to the light switch or controls, and with Speech Recognition activated, without even physically going to the computer. This type of Speech Recognition application is another important aspect of how sound processing technology can benefit society beyond applications in the arts and sciences. Pitch and speech recognition research can make it possible for every person to lead a productive and independent life.

Further options available to make classroom activities accessible to the mobility impaired student, or to make general computer function available to all users, include the FrogPad keyboard (Frogpad 2005), which makes one-handed typing easy; an experienced single-hand keyboard user can easily keep up with other typists using traditional two-hand keyboards. (Many students and other computer users that do not have any physical limitations may also find that the one-hand keyboard is convenient and allows for easy data entry in a multitude of situations.) Computer users with significantly decreased motor ability, such as quadriplegic computer users, can also benefit from technology such as the Eye-gaze Response Interface Computer Aid, known as ERICA. (ERICA 2005) This technology allows users to control the Mac OSX system and all applications with eye movements alone, utilizing a camera and infrared light system.

Other aids which provide similar functionality include the HeadMouse Extreme (HeadMouse 2005). In order to use this device, a mark is placed on the forehead of the user, and this mark is tracked by a camera as the user moves just the head to control the computer. Mac OSX has a highly functional on-screen keyboard, which can also be accessed using VoiceOver and other speech options, as well as other built-in options that make alternative mouse devices work well with the system.

Apple PlainTalk is a term which refers to a number of speech synthesis and speech recognition technologies developed by Apple Computer. (McMillan 2005) PlainTalk was in development for several years, and the early 1990s were a time of increased investment and interest in speech recognition technology at Apple. Many of the best-known researchers in the speech recognition field were recruited, and the result was the first PlainTalk release in 1993. Since System 7.1.2, PlainTalk has been a standard system component. All PowerPC Apple computers have come with PlainTalk already installed.

PlainTalk, like all of Apple’s text-to-speech technology, uses diphones. The advantage of using diphones is that, unlike many other text-to-speech methods, the process is not very resource-intensive; it therefore allows for better multitasking and functions properly even on lower-end or much older Macintosh computers. However, diphones do limit in many ways how natural the speech can seem, and they have some other synthesis limitations. The Speech Manager interface allows Apple’s speech synthesis to be used in third-party applications. Even non-programmers can easily use the available control sequences to customize the rhythm, intonation, volume, pitch, and rate of speech to suit the many situations which call for speech synthesis.

MacInTalk was the first PlainTalk component released. (McMillan 2005) This was a simple system extension that allowed for text-to-speech synthesis. Apple first used MacInTalk in 1984 when introducing the first Macintosh computers; the Macintosh was able to vocalize a greeting and actually introduce itself. There is some interesting speculation, however, that Apple did not actually have access to the original source code for MacInTalk, and that it was developed by the SoftVoice company for the Apple II computers. While the original MacInTalk may not have been supported by Apple, MacInTalk 2 was a fully supported speech synthesis system, available for Macintosh System 6.0.7 and later. MacInTalk 3 and MacInTalk Pro were greatly improved synthesis systems which required a 33 MHz processor, and with the introduction of the PowerPC and AV Macs, they were finally usable on a wide scale. These synthesizers supported a full set of synthesis voices. Mac OSX has greatly enhanced synthesis voices, many of them almost twenty times larger than the voices used in previous releases because of their higher-quality diphone samples. (McMillan 2005)

In 1991, Apple demonstrated a speech recognition technology referred to by the codename Casper. Casper was eventually integrated into PlainTalk in 1993, and it was available for all PowerPC Macintosh computers, though it often required a custom installation of the operating system to activate the speech recognition functions. This speech recognition is intended for issuing commands, not for dictation. It can be configured to pick up vocal commands only while a certain key is pressed, or to listen continuously for a keyword that activates the commands which follow it. “A graphical status monitor, often in the form of an animated character, provides visual and textual feedback about listening status, available commands and actions taken. It can also communicate back with the user using speech synthesis.” (McMillan 2005) Speech recognition originally allowed users to access all menus with voice commands; however, this proved to be too resource-intensive, and the functions became too unreliable when used for so many commands. As a part of the Universal Access of OSX, however, the Spoken User Interface once again provides complete voice-activated access to the whole system.

Encoding formats are an important issue when dealing with any type of audio information on a computer. Uncompressed audio recordings require a large amount of storage space, such that inadequate storage can be a limiting factor despite the rapidly decreasing cost of computer hard drives. To save space, most audio files stored on computers today are compressed using one of a large variety of methods. Compression techniques can be divided into lossless and lossy methods. Lossless compression works by using a mathematical algorithm to identify repeated patterns in the data, then substituting much shorter symbols for the patterns. When the process is reversed to decompress the data, the result is identical to the original. Lossy audio compression attempts to remove data that is unimportant in order to save space.
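
The defining property of lossless compression, that decompression returns exactly the original data, can be checked with a minimal sketch. It uses Python's standard `zlib` module, which is a general-purpose compressor rather than an audio codec, and the repetitive byte string is an invented stand-in for real sample data.

```python
import zlib

# Highly repetitive stand-in data; real audio is usually far less regular.
original = b"do re mi fa sol la ti do " * 200

compressed = zlib.compress(original, 9)   # replace repeated patterns with short codes
restored = zlib.decompress(compressed)    # reverse the process

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

Dedicated lossless audio formats typically add signal prediction on top of this kind of substitution, so the savings shown here should not be read as typical for audio.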

Lossy audio compression is largely based on psychoacoustics, the science of analyzing how the human brain perceives sounds. Compression methods are usually some combination of removing data determined to be of minimal importance to the perception of the sound, and altering the data to make it more regular so that lossless substitution methods can compress it further. Commonly used techniques in lossy audio compression are the removal of masked sounds, the removal of sounds near or beyond the threshold of human hearing, and the reduction of the frequency resolution of the signal. Masking is the effect that makes quiet sounds imperceptible in the presence of loud sounds; the singing of birds cannot be heard over a jet aircraft taking off nearby. This effect is known as simultaneous masking and should be familiar to anyone with normal hearing. The masking effect remains for a short period after the end of the sound, a phenomenon known as temporal masking.

A more surprising aspect of temporal masking is that a quiet sound shortly before a loud sound is also masked. The removal of masked sounds is an especially effective technique in audio compression because it not only removes a substantial amount of unimportant data, it makes the data more regular, improving the ability to further compress the data by substitution. Audio recordings may contain sound information that is beyond the range of human hearing. Sounds that are too quiet or have frequencies that are too high may be safely discarded with no noticeable effect. Frequencies below the range of human hearing are worse candidates for removal because they can often be felt even when they cannot be heard.

There are limits to the ability of humans to distinguish between similar frequencies. The limit varies between individuals, and across the range of audible frequencies, allowing a well-designed system to round frequencies that are very close together. One possible technique would be to round frequencies that are very close to standardized musical notes to the standard frequency. Such techniques can even have positive implications for use in certain types of performance. Karaoke machines are now being shipped with pitch correction, which can have positive results for the audience members at parties and karaoke bars that have to listen to less-than-perfect pitch.
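
The rounding technique suggested above can be sketched as follows, assuming the twelve-tone equal-tempered scale with the A above middle C at 440 Hz. The function names and the 10-cent tolerance are illustrative assumptions, not values taken from any particular codec or karaoke system.

```python
import math

A4_HZ = 440.0  # reference pitch: the A above middle C
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz):
    """Return (name, standard frequency, offset in cents) for the closest note."""
    semitones = round(12 * math.log2(freq_hz / A4_HZ))   # whole steps away from A4
    standard = A4_HZ * 2 ** (semitones / 12)              # twelfth-root-of-two spacing
    cents = 1200 * math.log2(freq_hz / standard)          # signed distance to that note
    midi = 69 + semitones                                 # MIDI numbering, where A4 is 69
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, standard, cents

def snap(freq_hz, tolerance_cents=10.0):
    """Round a frequency to the standard note only if it is already very close."""
    _, standard, cents = nearest_note(freq_hz)
    return standard if abs(cents) <= tolerance_cents else freq_hz

print(nearest_note(442.0))   # slightly sharp A4
print(snap(442.0))           # within tolerance, so rounded to 440.0
print(snap(450.0))           # too far from any note, so left unchanged
```

The tolerance expresses the idea that only frequencies too close to a standard note to be told apart from it should be rounded.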

The first lossy audio compression method to become widely known was MPEG audio layer 2, or MP2. MP2 was developed as a standard for digital audio broadcast by the Fraunhofer Society with funding from the European Union. Karlheinz Brandenburg and Jurgen Herre found MP2 inadequate and developed an improved version, MP3. Brandenburg famously used a recording of Suzanne Vega’s song “Tom’s Diner” to test the quality of the encoding during development. The song was selected for its softness and simplicity, as these attributes made it very easy to hear imperfections in the encoding. MP3 became the standard for audio file distribution on the Internet, and is still by far the most popular lossy audio codec among consumers despite the development of newer and more effective standards such as MPEG2 audio (AAC) and Vorbis.

In spite of the obvious advantages of lossy audio compression, there are situations in which its use is inappropriate. Lossy audio compression is potentially harmful when used on data that will be manipulated after compression. While well-designed lossy audio compression can achieve size reductions of around an order of magnitude, most sound editing operations require decompressing the sound before making changes. Compressing the sound again results in an increased risk of audible artifacts because the compressor is attempting to replicate the sound produced by the first compression, not the original sound. After several repetitions, audible compression artifacts are guaranteed to be produced. There is even some evidence that long-term exposure to lossy audio compression may result in hearing problems such as tinnitus, though this is not widely accepted. (You 2002)

Lossless audio compression is far less effective at reducing space, providing only about a 50% reduction in file size on average. Lossless compression is still useful for editing operations, as it allows sounds to be stored in half the space without any loss of quality. In general, sound compression, if used at all, should be lossless up until the point where the sound is in its final form, never to be modified or transformed again. Delivery of downloaded music to consumers is an example of an application in which the sound has reached that final form.

Natural Language Processing, also known as NLP, is a subfield of both artificial intelligence and linguistics. Natural Language Processing studies the problems of processing and manipulating natural language and, as a subfield of artificial intelligence, studies ways to make computers “understand” human language. (Mokhov 2005) When early researchers experimented with language only in restricted “blocks worlds,” where only a limited vocabulary was made available to the system, they became extremely optimistic about the near future of language processing in artificial intelligence. However, once systems were tested in a less restricted environment, they were found to have many shortcomings that were years from being overcome. One such system from this period is SHRDLU.

SHRDLU was developed by Terry Winograd at MIT in the late 1960s. This natural language computer program was written in Planner and Lisp. The name was derived from the arrangement of the keys on the Linotype machine: “ETAOIN SHRDLU.” An example of SHRDLU’s language functionality demonstrates both how impressive the program’s language understanding is and where that understanding falls short.


Computer: OK.

Person: GRASP THE PYRAMID.


Person (changing their mind): FIND A BLOCK WHICH IS TALLER THAN THE ONE YOU ARE HOLDING AND PUT IT INTO THE BOX.


Computer: OK.




Computer: THE BOX.


Computer: FOUR OF THEM.


Computer: YES, THE RED CUBE. (Spinach 2005)

This was a brilliant demonstration of the potential for AI and a part of the reason that researchers were so very optimistic. SHRDLU, however, was not able to handle ambiguity and other difficulties which are inevitable in natural language processing. This program led to further important developments in language processing; however, the original developer of the program considers it to be a dead end. “Natural language understanding is sometimes referred to as an AI-complete problem, because natural language recognition seems to require extensive knowledge about the outside world and the ability to manipulate it. The definition of ‘understanding’ is one of the major problems in natural language processing.” (Mokhov 2005)

Many problems are encountered during the development of natural language understanding systems. One of these problems is that many sentences or phrases are not comprehensible without previous or outside information. An example of such a situation is the following pair of sentences about monkeys and bananas. The use of the word “they” is of particular importance here.

“We gave the monkeys the bananas because they were hungry.”

“We gave the monkeys the bananas because they were over-ripe.” (Mokhov 2005)

These sentences are very similar grammatically, but the word “they” refers to the monkeys in the first sentence, and to the bananas in the second sentence. In order to understand these sentences, it is necessary to know the behavior and properties of monkeys and bananas. Without that knowledge, it would be as logical to think that the bananas were hungry and that the monkeys were over-ripe.

In other situations, the very same sentence can be interpreted in many different ways, such as the sentence “Time flies like an arrow,” in which the word “time” can be interpreted as a noun, verb, or adjective. (Mokhov 2005) In other instances, it can be impossible to determine which word an adjective is meant to modify.

The major tasks in Natural Language Processing include text-to-speech, speech recognition, natural language generation, machine translation, question answering, information retrieval, information extraction, text-proofing, translation technology, and automatic summarization. (Mokhov 2005)

Some of the common problems in Natural Language Processing include word boundary detection, word sense disambiguation, syntactic ambiguity, imperfect or irregular input, and speech acts and plans. Imagine a language processing system trying to understand the famous lyrical phrase “Blue Bayou” and mistaking it for “Blew by you,” or thinking that “Excuse me while I kiss the sky” is supposed to be “...while I kiss this guy.” These are infamous mishearings of lyrics from famous songs, and they are made by native English speakers with a working knowledge of grammar and context! Such mistakes are caused by combinations of these natural language processing problems. Such problems are sometimes addressed with statistical natural language processing, which “uses stochastic, probabilistic and statistical methods to resolve some of the difficulties discussed above, especially those which arise because longer sentences are highly ambiguous when processed with realistic grammars, yielding thousands or millions of possible analyses.” (Mokhov 2005) The obstacles in natural language processing will be explored in the present study when experimenting with text-to-speech, speech recognition, and other sound processing.

Word boundary detection is difficult because in most languages there are usually no easily detectable pauses or gaps between spoken words. Ambiguous word boundaries could be a significant problem in sentences such as “He cooked meat once,” which could be misinterpreted as “He cooked me at once.” Some languages, like Chinese, do not even have signaled word boundaries. This makes it necessary to have a working knowledge of grammar and contextual meaning. However, even a fluent, native speaker of a language may still have difficulty with word boundaries. When listening to speech, a language processing system may interpret a phrase as having a different meaning than intended, or it may not be able to detect any logical meaning at all without clear word boundaries.
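
The “He cooked meat once” example can be reproduced with a small sketch that enumerates every way of dividing an unbroken stream into known words. It works on letters rather than on sound, and the six-word vocabulary and the function name `segmentations` are assumptions made purely for illustration.

```python
def segmentations(text, vocabulary):
    """Return every way to split text into a sequence of vocabulary words."""
    if not text:
        return [[]]  # one way to segment nothing: the empty sequence
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in vocabulary:
            # Keep this word and segment whatever follows it.
            for rest in segmentations(text[end:], vocabulary):
                results.append([word] + rest)
    return results

VOCAB = {"he", "cooked", "me", "meat", "at", "once"}

for words in segmentations("hecookedmeatonce", VOCAB):
    print(" ".join(words))
```

Both readings are valid sequences of words, so the boundaries alone cannot settle which was meant; grammar and context have to decide.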

Another problem faced by Natural Language Processing systems is word sense disambiguation. Many words have more than one meaning, and distinct words sometimes share the same spelling and sometimes the same pronunciation, which causes problems for both textual and spoken input. Puns may amuse native speakers of a language, but deciding which meaning makes the most sense in a given context is a difficult obstacle in language processing. Syntactic ambiguity is a similar problem: “The grammar for natural languages is not unambiguous, i.e. there are often multiple possible parse trees for a given sentence. Choosing the most appropriate one usually requires semantic and contextual information.” (Mokhov 2005)
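
The idea of multiple parse trees for one sentence can be made concrete with a toy example. The sketch below counts parses of “time flies like an arrow” using the standard CYK chart method; the grammar rules and the lexicon are a deliberately tiny, invented fragment of English, not a realistic grammar.

```python
from collections import defaultdict

# Toy grammar in binary form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "N", "N"),    # noun compound, as in "time flies" (a kind of fly)
    ("VP", "V", "NP"),
    ("VP", "V", "PP"),
    ("PP", "P", "NP"),
]

# Each word may belong to several categories; this is the source of ambiguity.
LEXICON = {
    "time": {"NP", "N"},
    "flies": {"N", "V"},
    "like": {"V", "P"},
    "an": {"Det"},
    "arrow": {"N"},
}

def count_parses(words, root="S"):
    """Count the distinct parse trees of words under the toy grammar."""
    n = len(words)
    chart = defaultdict(int)  # (start, end, symbol) -> number of trees
    for i, word in enumerate(words):
        for tag in LEXICON[word]:
            chart[(i, i + 1, tag)] += 1
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            end = start + length
            for mid in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    chart[(start, end, parent)] += (
                        chart[(start, mid, left)] * chart[(mid, end, right)]
                    )
    return chart[(0, n, root)]

print(count_parses("time flies like an arrow".split()))
```

Under this fragment the sentence has one reading in which time passes the way an arrow does and another in which insects called “time flies” are fond of an arrow. A grammar with broader coverage would admit more readings still.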

Many of the difficulties encountered by natural language processing systems are due to inevitable user error. For example, imperfect or irregular input causes many errors. This could be caused by regional accents or by the accents of non-native speakers. Speech impediments, such as a stutter or lisp, can also cause problems. Additionally, when textual information is entered, errors in typing, grammar, or spelling can cause difficulties.

Speech acts also present a problem for NLP systems. A speech act is what is done or performed when something is said, for example, asking a question, describing something, making a promise, insulting or greeting someone, and so forth.

“Sentences often don’t mean what they literally say; for instance a good answer to ‘Can you pass the salt’ is to pass the salt; in most contexts ‘Yes’ is not a good answer, although ‘No’ is better and ‘I’m afraid that I can’t see it’ is better yet. Or again, if a class was not offered last year, ‘The class was not offered last year’ is a better answer to the question ‘How many students failed the class last year?’ than ‘None’ is.” (Mokhov 2005)

This is an example of when one must “ask the right question” to get the right answer when dealing with AI. The system may not be able to extrapolate the intended meaning of a question or statement. The system may also not be able to determine what method of answering or responding would be most appropriate if there are many ways input may be addressed.

A large focus of this signal processing research revolves around the place of sound-related technology in the classroom and the effects found when combining this technology with education. It is important to touch on the reasons behind the choice to have the technology/education link as such an integral part of the chosen research on sound development.

The phrase “Digital Divide” refers to the disparity between the social classes regarding technology; the socioeconomic divide has a noted and distinct effect on the role of all higher technology on the younger generations.

“Disparities of race, income, and age persist, leaving many to dub the digital revolution a ‘digital divide’ between haves and have-nots.” (Axtman 2001) The term Digital Divide was coined in the 1990s, and research has continued to show that minority groups and low-income families are less likely to own a computer, have access to a computer, be computer-literate, or have the many other opportunities and skills related to technology that are vital for success in this high-tech society. Research also indicates that there is a strong link between computer ability and experience and the likelihood of a successful academic and professional career. Therefore it is vital that all students have access to a wide variety of computer technology, especially since these students are themselves the future of technological development. “In this educational environment, a computer is a necessity.” (Cornell 2002)

The findings of research on the Digital Divide are similar to the findings of studies from many years of examining the “Have” and “Have-Not” division of society.

Essentially, “High-income, Caucasian, married, and well-educated individuals have more access to it compared to low-income, African-American and Latino, unmarried, and less-educated individuals.” (Eamon 2004) This somewhat logical finding, that people with more money and better social resources at their disposal will be more likely to have access to things that cost a significant amount of money, is nonetheless important to understand fully. Access to basic technology has increased in the past years for all of society; however, the problem remains extreme. In 1999, the United Negro College Fund found that only 34% of students at UNCF schools used the Internet, while the average for Internet access across all four-year colleges in America was 78%. (Roach 1999) In the year 2000, it was determined that high-income homes in urban areas were more than 20 times as likely to have Internet access as low-income rural homes. (Prater 2000) In the year 2001, it was found that only thirteen percent of families with an income at or below $15,000 had Internet access, while families with an income of $75,000 or higher had Internet access seventy-eight percent of the time. (Axtman 2001) By the year 2002, ninety-seven percent of teenagers from families in the highest income bracket had computers, while only seventy-five percent of teenagers in lower-income families had access to computers. (Eamon 2004)

The disparities are not marked by income level alone. Other social factors put students at risk for not having an adequate computer education. In 2001, one study found that “White households are two times more likely to own a computer, and nearly three times more likely to have Internet access, than Black households.” (Chappell 2001)

In the same year, the United Negro College Fund found that only one sixth of college students in Black schools owned or had access to a personal computer. However, half of White students in non-Black schools not only had access to a personal computer, but actually owned one. The UNCF also revealed that significantly less than half of professors at Black schools had a computer, while professors at non-Black schools had computers seventy percent of the time. (Chappell 2001) Further research in 2004 found that Whites have higher rates of computer use overall (as well as higher rates of owning a home computer) than Latinos or African-Americans. (Eamon 2004)

Homes and colleges are not the only places that this Digital Divide is obvious. All schools, of course, are also significantly affected. The communities in which minorities and lower-income students live have fewer adults who can provide technical training, and there is less money in the community to provide for technical equipment.

Schools in low-income areas may be too preoccupied with other operating costs to realize the importance of providing students with computers and training.

“If we know that for many minority and low-income children the school is the only place where they use computers, then it is only common sense to insist that all teachers have access to the technology that will help all of our children succeed.” (Feldman 2001) In many communities, the key to ending the Digital Divide is held by the school alone. This social situation is relevant to the study at hand because technology must be developed in such a way that it can benefit the next generation, and the students of today must be exposed to technology so that they are trained to take an active part in future technological developments.

Teachers and students must both be trained in the use of technology for the classroom and beyond. One group, the CEO Forum on Education and Technology, keeps tabs on the progress of schools in integrating technology into the learning process, and then issues reports with recommendations for schools.

This organization’s 1999 report contained four extremely broad recommendations.

“Schools of education should prepare new teachers to integrate technology effectively into the curriculum; Current teachers and administrators should be proficient in integrating technology into the curriculum; Education policymakers and school administrators should create systems that reward the integration of technology into the curriculum; and Corporations and local businesses should collaborate with the education community to help ensure that today’s students will graduate with 21st century workplace skills.” (Feldman 2001)

Students must learn the real benefits of technology. Participation in a study that shows them that there is more to computers than online gambling, video games, and downloading pornography can change the way a group of students thinks about technology. In this study, students will be immersed in real uses of technology, and they will be able to see that these programs and hardware are useful, exciting, and applicable to real-world uses, not just playing video games or browsing the web. Students must become aware of the invaluable role technology plays in all aspects of life. “Resolving the digital divide…means educating those who are not aware of technology’s benefits, designing content that is relevant to those communities, and increasing access….it’s the difference between giving a person a fish and teaching him to fish.” (Roach 2002) The positive report is that although lower-class students may be less likely to have a computer, they are more likely to use their computer for educational purposes and spend less time doing things online that are not enriching. (Eamon 2004)

A great deal of music-related software has been written, ranging from simple noise makers for toddlers to sophisticated systems for musical education, creation and editing. Educational software includes musical games for young children, pitch and interval recognition training systems, and software that analyzes how music is played. At higher levels of musical education, software intended for professional music production is often used.

Entertainment software targeted toward very young children is usually mostly educational in nature. Most such software involves music to some degree, and a significant portion includes simple musical games such as identifying when the same note is played twice, or composing music using highly simplified versions of the sort of MIDI tools professionals use. These games are usually simplified versions of educational techniques used with older students.

The bulk of music software specifically intended for education consists of systems that play a note or sequence, then ask the student to identify it. The use of such tools saves a significant amount of human effort relative to more traditional pitch training, and is less prone to errors. Rarer, but perhaps more useful, are systems that recognize the pitch of a note played or sung by a student, providing useful feedback for improvement. Sophisticated systems can compare a student’s performance to a reference, either a recording or an electronic description of what a perfect performance of the work should sound like.
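
The answer-checking half of such a training system can be sketched in a few lines. The example below assumes equal temperament and covers only intervals up to an octave; the function names are hypothetical, and playing the notes aloud is left out.

```python
import math

INTERVALS = [
    "unison", "minor second", "major second", "minor third", "major third",
    "perfect fourth", "tritone", "perfect fifth", "minor sixth",
    "major sixth", "minor seventh", "major seventh", "octave",
]

def interval_name(low_hz, high_hz):
    """Name the interval between two notes, up to one octave apart."""
    semitones = round(12 * math.log2(high_hz / low_hz))
    if not 0 <= semitones <= 12:
        raise ValueError("interval outside the range of this sketch")
    return INTERVALS[semitones]

def check_answer(low_hz, high_hz, answer):
    """Return True if the student's answer names the interval just played."""
    return answer.strip().lower() == interval_name(low_hz, high_hz)

print(interval_name(440.0, 660.0))
print(check_answer(261.63, 329.63, "Major third"))
```

Because the comparison is done on frequency ratios, the same check works in any key.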

Software intended to convert between musical notation and sound is another tool commonly used in music education. Software that converts from notation to sound, usually using MIDI, is very common, ranging from games for very young children to professional recording tools like Apple’s GarageBand. Software that converts from sound to notation is much more difficult to create. Even simple pitch recognition is difficult because pitch is a human perception, not a physical phenomenon that is easily quantified by a computer. Simply measuring the frequency of a sound may not always correctly measure the pitch. In spite of the difficulty, such tools do exist, and are being continually improved.
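
One well-known case in which the strongest frequency and the perceived pitch disagree is a tone whose fundamental is missing. The sketch below builds a tone from the 400, 600, and 800 Hz harmonics of 200 Hz, which listeners typically hear at 200 Hz even though no energy is present there. The sample rate, amplitudes, and function names are assumptions for illustration, and autocorrelation is only one of several pitch estimation methods.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)

def synthesize(partials, n_samples):
    """Sum sine waves given as (frequency_hz, amplitude) pairs."""
    return [
        sum(a * math.sin(2 * math.pi * f * n / SAMPLE_RATE) for f, a in partials)
        for n in range(n_samples)
    ]

def strongest_component(signal, candidates_hz):
    """Return the candidate frequency carrying the most energy (naive DFT)."""
    def magnitude(freq):
        re = sum(x * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
                 for n, x in enumerate(signal))
        im = sum(x * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
                 for n, x in enumerate(signal))
        return math.hypot(re, im)
    return max(candidates_hz, key=magnitude)

def autocorrelation_pitch(signal, min_hz=50, max_hz=500):
    """Estimate pitch from the shortest lag at which the signal nearly repeats."""
    min_lag = SAMPLE_RATE // max_hz
    max_lag = SAMPLE_RATE // min_hz
    window = len(signal) - max_lag
    scores = {
        lag: sum(signal[n] * signal[n + lag] for n in range(window))
        for lag in range(min_lag, max_lag + 1)
    }
    best = max(scores.values())
    # Prefer the shortest lag close to the best score to avoid octave errors.
    lag = min(l for l, s in scores.items() if s >= 0.95 * best)
    return SAMPLE_RATE / lag

# Harmonics of 200 Hz with the 200 Hz fundamental itself absent.
tone = synthesize([(400, 1.0), (600, 0.8), (800, 0.6)], 960)
print("strongest component:", strongest_component(tone, [200, 400, 600, 800]))
print("autocorrelation pitch:", autocorrelation_pitch(tone))
```

The frequency measurement reports 400 Hz, while the periodicity estimate reports 200 Hz, which is closer to what a listener would name as the pitch.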

Recording and editing tools blur the line between educational and professional software; the same software is likely to be used both by professional musicians and more advanced students. Such tools range from simple sound file editors like Audacity to sophisticated audio creation environments like Rosegarden.

There are times when no existing tools are suitable for a given purpose without modification. Many sound tools are extensible in the language they were written in. This is especially common in Lisp-based tools, which are often written using a bottom-up style, such that the majority of the program is actually written in a language specifically intended for the task at hand, built on top of Lisp. It is common for such languages, as well as the internals of the program, to be made available to the user. Systems written in other languages usually have to include a scripting language in order to provide similar extensibility.

Sometimes extending an existing tool is not enough. Most general-purpose programming languages have open-source sound libraries available for writing arbitrary sound-related applications. The availability of such libraries can greatly reduce the time required to write certain types of applications, and in some cases is more important than the merits of the programming language itself. The required tool can often be created by tying together a few library functions along with a graphical interface. Rapid application development tools can make the process possible even for those with minimal programming experience.

Works Cited

Annabelleke, et al. (2005, June 20) Signal-to-noise ratio. Wikipedia. Accessed online June 1, 2005 at

Axtman, K. (2001, August 22) Houston to make computing a right, not a privilege. The Christian Science Monitor. Accessed online June 1, 2005 at

Buckler, G. (2001, November 16) Digital divide creates haves and have nots. Computing Canada. Accessed online June 1, 2005 at

Chappell, K. (2001, September) UNITED NEGRO COLLEGE FUND: Crossing the Digital Divide. Ebony. Accessed online June 1, 2005 at

Dean, K. (2002, January 9) Lot to learn about school laptops. Wired News. Accessed online June 1, 2005,1383,49576,00.html?tw=wn_story_related

De Cheveigne, A. (2004) Pitch perception models – a historical review. CNRS – Ircam, Paris, France.

Eamon, M.K. (2004, June) Digital divide in computer access and use between poor and non-poor youth. Journal of Sociology and Social Welfare. Accessed online June 1, 2005 at

ERICA system. Eye Response Technology. Accessed online June 1, 2005 at

Failure Free Reading Program. Accessed online June 1, 2005 at

Feldman, S. (2001, March) The Haves and Have Nots of the Digital Divide. School Administrator. Accessed online June 1, 2005 at

FrogPad Keyboard. Accessed online June 1, 2005 at

Gat, E. (2000) Lisp as an alternative to Java. Flownet. Accessed online June 1, 2005 at

Graham, P. (2001a, May) Being Popular. Essays. Paul Graham. Accessed online June 1, 2005 at

Graham, P. (2001b, May) Five questions about language design. Essays. Accessed online June 1, 2005 at

Graham, P. (2001c, April) Java’s Cover. Essays. Accessed online June 1, 2005 at

Graham, P. (1993) On Lisp. Prentice Hall.

Graham, P. (2002, May) Succinctness is power. Essays. Paul Graham. Accessed online June 1, 2005 at

Greenspun, P. (2000) Research. Philip Greenspun. Retrieved June 1, 2005, at

HeadMouse extreme. Orin, access. Accessed online June 1, 2005 at

Heresiarch, W.E., et al. (2005, June 22) Pitch (music). Wikipedia.

Johnston, R.C., & Viadero, D. (2000, March 15). Unmet promise: Raising minority achievement. Education Week. Retrieved June 1, 2005, at

Kahn, A.B. Assistive technology for children who have Cerebral Palsy: Augmentation communication devices. New Horizons for Learning. Inclusion. Accessed online June 1, 2005 at

Klapuri, A.P. (2003, November) Multiple fundamental frequency estimation based on harmonicity and spectral smoothness. IEEE Transactions on Speech and Audio Processing. Vol 11, No 6.

Martin, B. (2002, December 19) Cornell, local school team up on digital divide. Black Issues in Higher Education. Accessed online June 1, 2005 at

Ma, N. (2003, August) Identification and elimination of crosstalk in audio recordings. Sheffield, Department of Computer Science, University of Sheffield.

McMillan, A., et al. (2005, April 13) Apple PlainTalk. Wikipedia. Accessed online June 1, 2005 at

Mokhov, et al. (2005) Natural Language Processing. Wikipedia. Accessed online June 15, 2005 at

Murdock, I. (2005, May 26) Open source and the commoditization of software. Ian Murdock’s Weblog. Accessed online June 1, 2005 at

Neuhoff, J.; Wayand, J.; & Knight, R. (2002, July) Pitch change, sonification, and musical expertise: Which way is up? International Conference on Auditory Display, Kyoto Japan.

Omegatron. (2005, May 8) Psychoacoustics. Wikipedia. Accessed online June 1, 2005

On-Key Karaoke. Vendor’s website. Accessed online June 1, 2005 at

Orfield, G. (2001). Schools more separate: Consequences of a decade of resegregation. Retrieved June 1, 2005, from Harvard University, the Civil Rights Project Web site:

Prater, L.F. (2000, November) Pioneers blaze trail through digital divide. Successful Farming. Accessed online June 1, 2005 at

Raymond, E.S. (2003) The Art of Unix Programming. Thyrsus Enterprises. Accessed online June 1, 2005 at

Reglin, G. (1992). Ability grouping: A sorting instrument. Illinois Schools Journal. Fall, pp.43-47.

Roach, R. (1999, August 5) UNCF Examines Digital Divide on Campus. Black Issues in Higher Education. Accessed online June 1, 2005 at

Roach, R. (2002, January 3) Law professor explores digital divide, race in new book. Black Issues in Higher Education. Accessed online June 1, 2005 at

Schmitz, S. (1991) Achievement Testing – Critique of Standardized Achievement Tests. Mothering, Fall. Retrieved June 1, 2005, at

SoftTTY. Accessed online November 21, 2004 at

Spinach, R. (2005, June) SHRDLU. Wikipedia. Accessed online June 20, 2005 at

Steele, C. (1999) “Stereotype Threat” and black college students. The Atlantic Monthly, 284 (2), 44-45. Retrieved June 1, 2005 from the Atlantic Monthly Web site:

Taylor, J.S. The Java Problem. The Arc Hub. Accessed online June 1, 2005 at

Universal accessibility. Apple computers. Accessed online June 1, 2005 at

Useem, E.L. (1990). You’re good, but you’re not good enough: Tracking students out of advanced mathematics. American Educator, 14 (3), 24-27, 43-46.

Wernher. (2005, June 19) Supercomputer. Wikipedia. Accessed online June 20, 2005 at

X-10. Products, Always Thinking. Accessed online June 1, 2005 at

You. (2002) Ear Damage by MP3, DVD and Digital Television. You want to see a movie. Accessed online June 1, 2005 at
