Information Technology Fundamentals – Data Types And Units

Data Types and Units

A data type is a collection or grouping of data values. A type may be specified for many reasons, including similarity, convenience, or to focus attention. Good organization of this kind often makes complex definitions easier to understand. Almost all programming languages explicitly include the notion of data type, although the available types are often restricted by considerations of simplicity, computability, or regularity. An explicit data type declaration typically allows the compiler to choose an efficient machine representation, but the conceptual organization that data types provide should not be underestimated.

Different languages may use different data types, or types with the same name but different semantics. In the Python programming language, for instance, int represents an arbitrary-precision integer that supports the usual numeric operations such as addition, subtraction, and multiplication. In Java, however, int represents the 32-bit integers ranging from −2,147,483,648 to 2,147,483,647, with arithmetic operations that wrap on overflow. In Rust, this 32-bit integer type is denoted i32, and arithmetic on it panics on overflow in debug mode.
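The difference between the two semantics can be sketched in Python. The helper wrap_int32 is a hypothetical name introduced here to imitate 32-bit wrapping arithmetic; it is an illustration, not how Java is implemented.

```python
def wrap_int32(value: int) -> int:
    """Reduce an integer to the signed 32-bit range, wrapping on overflow."""
    return (value + 2**31) % 2**32 - 2**31


INT32_MAX = 2**31 - 1

# Python's int is arbitrary precision, so the result simply grows.
python_result = INT32_MAX + 1

# Simulated 32-bit arithmetic wraps around to the smallest value.
wrapped_result = wrap_int32(INT32_MAX + 1)

print(python_result)   # 2147483648
print(wrapped_result)  # -2147483648
```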

Most programming languages also allow the programmer to define new data types, typically by combining multiple elements of other types and defining the valid operations of the new type. For instance, a programmer could create a new data type named "complex number" with real and imaginary parts, or a colour data type represented by three bytes giving the amounts of red, green, and blue, together with a string holding the colour's name.
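As a minimal sketch of the colour example, assuming Python dataclasses as the mechanism, a user-defined type can combine a name with three byte-sized channels and define its own operations. The class name Colour and the method to_bytes are illustrative choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Colour:
    name: str
    red: int
    green: int
    blue: int

    def __post_init__(self) -> None:
        # Each channel must fit in a single byte.
        for channel in (self.red, self.green, self.blue):
            if not 0 <= channel <= 255:
                raise ValueError("each channel must be in the range 0-255")

    def to_bytes(self) -> bytes:
        """Return the three-byte red, green, blue representation."""
        return bytes((self.red, self.green, self.blue))


orange = Colour("orange", 255, 165, 0)
print(orange.to_bytes().hex())  # ffa500
```

Rejecting out-of-range channels in the constructor is one way of restricting the new type to its valid values.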

Data types are used within type systems, which offer various ways of defining, implementing, and using them. In a type system, a data type is a constraint placed on the interpretation of data, describing the representation, interpretation, and structure of values or objects stored in computer memory. The type system uses data type information to check the correctness of programmes that access or manipulate the data. A compiler may use the static type of a value to optimise the storage it needs and the choice of operations performed on it. For example, many C compilers represent the float data type in 32 bits, in accordance with the IEEE specification for single-precision floating point numbers. They will therefore use floating-point-specific microprocessor operations on these values (floating-point addition, multiplication, etc.).
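The 32-bit single-precision layout can be inspected from Python with the standard struct module. This is only a sketch of the representation; it says nothing about what any particular C compiler does.

```python
import struct

# "<f" means little-endian IEEE 754 single precision (4 bytes).
packed = struct.pack("<f", 0.1)
single = struct.unpack("<f", packed)[0]

print(len(packed) * 8)  # 32
# Python floats are double precision, so the round trip through
# 32 bits loses some precision.
print(single == 0.1)    # False
```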


Parnas, Shore, and Weiss (1976) identified five definitions of "type" used in the literature, sometimes implicitly:

  • Syntactic: A type is a purely syntactic label associated with a variable when it is declared. Although such definitions are useful for advanced type systems such as substructural type systems, they provide no intuitive meaning for the types.
  • Representation: A type is defined in terms of its composition of more primitive types, often machine types.
  • Representation and behaviour: A type is defined by its representation and the set of operators that manipulate it.
  • Value space: A type is a set of possible values that a variable may hold. Such definitions make it possible to speak about (disjoint) unions and Cartesian products of types.
  • Value space and behaviour: A type is a set of values that a variable may hold and a set of functions that can be applied to these values.

Definitions in terms of a representation were common in imperative programming languages such as ALGOL and Pascal, whereas definitions in terms of a value space and behaviour were used in higher-level languages such as Simula and CLU. Types that include behaviour align more closely with object-oriented models, whereas types in a structured programming model typically do not include code and are called plain old data structures.

Classification of data types

Data types can be categorized based on a number of factors:

Primitive data types or built-in data types are types that are built into the implementation of a programming language. User-defined data types are non-primitive types. Java's numeric types, for instance, are primitive, whereas classes are user-defined.

A value of an atomic type is a unit of data that cannot be broken down into its constituent pieces. A value of a composite or aggregate type is a collection of independently accessible data components. Despite consisting of a succession of bits, an integer is typically considered atomic, whereas an array of integers is certainly composite.

Basic data types or fundamental data types are defined axiomatically from fundamental concepts or by enumerating their elements. Generated data types or derived data types are specified, and partly defined, in terms of other data types. All basic types are atomic. For example, integers are a basic type defined in mathematics, whereas an array of integers is the result of applying an array type generator to the integer type.

The nomenclature varies; primitive, built-in, basic, atomic, and fundamental may be used interchangeably in the literature.

Notable data types

Machine data types

All data in digital electronics-based computers is represented as bits (0 and 1) at the most fundamental level. Typically, the smallest addressable unit of data is a set of bits known as a byte (usually an octet, which is 8 bits). A word is the unit processed by machine code instructions (as of 2011, typically 32 or 64 bits).
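A short Python sketch shows the same value viewed as bits and as bytes. The 4-byte, big-endian layout is an assumption chosen for illustration.

```python
value = 1000

# The value as a sequence of bits.
print(bin(value))          # 0b1111101000
print(value.bit_length())  # 10

# The same value laid out as a 32-bit (4-byte) word, most significant byte first.
word = value.to_bytes(4, byteorder="big")
print(list(word))          # [0, 0, 3, 232]
```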

Machine data types expose or make available fine-grained control over hardware, but this can also expose implementation details that make code less portable. Consequently, machine types are used mainly in systems programming and low-level programming languages. In higher-level languages, most data types do not have a language-defined machine representation. C, for example, provides types such as booleans, integers, and floating-point numbers, but the exact bit representations of these types are implementation-defined. The char type, which represents a byte, is the only C type with a precise machine representation.

Boolean type

The Boolean type represents the values true and false. Although only two values are possible, they are often stored as a word rather than as a single bit, because storing and retrieving a single bit requires more machine instructions. Many programming languages do not have an explicit Boolean type, instead using an integer type and interpreting 0 as false and other values as true. Boolean data refers to the logical structure of how the language is translated into machine language. In this case, a Boolean 0 corresponds to the logic value False, and True is any non-zero value, most commonly one, known as Boolean 1.
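Python happens to illustrate both conventions: it has an explicit bool type, but that type is a subclass of int, and integers are interpreted as truth values where needed.

```python
# bool is an explicit type, yet it is built on top of int.
print(issubclass(bool, int))  # True
print(True == 1, False == 0)  # True True

# Zero is interpreted as false and any other integer as true.
print(bool(0), bool(7))       # False True

# Because True behaves as the integer 1, arithmetic on it is allowed.
print(True + True)            # 2
```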

Numeric types

Almost all programming languages provide at least one integer data type. They may either provide a small number of predefined subtypes restricted to certain ranges (such as short and long and their unsigned counterparts in C/C++), or allow users to freely define subranges such as 1..12 (e.g. Pascal/Ada). If a corresponding native type does not exist on the target platform, the compiler will break operations on it down into code that uses types that do exist. For instance, if a 32-bit integer is requested on a 16-bit platform, the compiler will silently treat it as an array of two 16-bit integers.

Floating point data types represent certain fractional values (rational numbers, mathematically). Although they have predetermined limits on both their maximum values and their precision, they are sometimes misleadingly called reals (evocative of mathematical real numbers). They are typically stored internally in the form a × 2^b (where a and b are integers), but displayed in familiar decimal form.
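The a × 2^b form can be recovered for a Python float, which is a double-precision value. The function name as_binary_fraction is a hypothetical helper introduced for this sketch.

```python
from fractions import Fraction
from typing import Tuple


def as_binary_fraction(x: float) -> Tuple[int, int]:
    """Return integers (a, b) such that x == a * 2**b exactly."""
    numerator, denominator = x.as_integer_ratio()
    # The denominator of a finite float is always a power of two,
    # so its bit length gives the (negated) exponent.
    exponent = -(denominator.bit_length() - 1)
    return numerator, exponent


print(as_binary_fraction(0.75))  # (3, -2)

a, b = as_binary_fraction(0.1)
print(a, b)                      # 3602879701896397 -55

# The stored value is an exact rational number, but it is not exactly 1/10.
print(Fraction(a) * Fraction(2) ** b == Fraction(1, 10))  # False
```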

Fixed point data types are convenient for representing monetary values. They are often implemented internally as integers, which leads to predefined limits.
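A minimal fixed-point sketch stores amounts as an integer number of cents. The helpers to_cents and format_cents are hypothetical, and the sketch assumes non-negative amounts with two decimal places.

```python
CENTS_PER_UNIT = 100


def to_cents(amount: str) -> int:
    """Parse a non-negative decimal string such as '19.99' into integer cents."""
    units, _, fraction = amount.partition(".")
    # Pad or truncate the fractional part to exactly two digits.
    return int(units) * CENTS_PER_UNIT + int(fraction.ljust(2, "0")[:2])


def format_cents(cents: int) -> str:
    """Format a non-negative number of cents as a decimal string."""
    return f"{cents // CENTS_PER_UNIT}.{cents % CENTS_PER_UNIT:02d}"


total = to_cents("0.10") + to_cents("0.20")
print(format_cents(total))  # 0.30

# The same sum in binary floating point is not exact.
print(0.1 + 0.2 == 0.3)     # False
```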

For architectural independence, a Bignum or arbitrary precision numeric type may be provided. This represents an integer or rational with a precision that is limited only by the system's memory and computing resources. Bignum implementations of arithmetic operations on machine-sized values are considerably slower than their machine counterparts.


Enumerated types

The enumerated type has distinct values that can be compared and assigned, but that do not necessarily have any particular concrete representation in the computer's memory; compilers and interpreters can represent them arbitrarily. The four suits in a deck of playing cards, for instance, may be represented by four enumerators named CLUB, DIAMOND, HEART, and SPADE that belong to an enumerated type named suit. If a variable V is declared with suit as its data type, any of these four values may be assigned to it. Some implementations allow programmers to assign integer values to the enumeration values, or even to treat them as if they were integers.
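The playing-card example might look as follows with Python's standard enum module. The integer values assigned to the members are an arbitrary choice.

```python
from enum import Enum


class Suit(Enum):
    CLUB = 1
    DIAMOND = 2
    HEART = 3
    SPADE = 4


v = Suit.HEART

# Enumeration values can be compared and assigned.
print(v == Suit.HEART)  # True
# A plain Enum member is not interchangeable with its integer value.
print(v == 3)           # False
# The member can still be looked up from the value it was given.
print(Suit(3))          # Suit.HEART
```

Python also provides enum.IntEnum for the case where members should behave as integers.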

String and text types

A string is a sequence of characters used to store words or plain text, and often text in a markup language that represents formatted content. Characters may be letters of an alphabet, digits, blank spaces, punctuation marks, etc. Characters are drawn from a character set such as ASCII. Character and string types can have different subtypes according to the character encoding. The original 7-bit wide ASCII was found to be limited and has been superseded by 8-, 16- and 32-bit sets, which can encode a wide variety of non-Latin alphabets (such as Hebrew and Chinese) and other symbols. Strings may be of either variable length or fixed length, and some programming languages support both. They may also be subtyped by their maximum size.

Since most character sets include the digits, it is possible to have a string consisting of digits, such as "1234". These numeric strings are usually considered distinct from numeric values such as 1234, although some programming languages convert between them automatically.
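A small Python sketch shows the distinction between a numeric string and a numeric value, and how the size of one character depends on the encoding. Python does not convert between strings and numbers automatically, so the conversion is written out.

```python
text = "1234"
number = 1234

print(text == number)       # False
print(int(text) == number)  # True

# The same operator means concatenation for strings and addition for numbers.
print(text + text)          # 12341234
print(number + number)      # 2468

letter = "\u00e9"  # the character 'é', which is outside 7-bit ASCII
print(len(letter.encode("utf-8")))      # 2
print(len(letter.encode("utf-16-le")))  # 2
print(len(letter.encode("utf-32-le")))  # 4
```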

Union type

A union type definition specifies which of a number of permitted subtypes may be stored in its instances, such as "float or long integer". In contrast to a record, which could be defined to contain both a float and an integer, a union may contain only one subtype at a time.

For increased type safety, a tagged union (also known as a variant, variant record, discriminated union, or disjoint union) has an additional field identifying its current type.
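A tagged union of "float or integer" can be sketched in Python with an explicit tag field. The class Number and the function describe are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass
from typing import Union

PERMITTED = {"float": float, "int": int}


@dataclass(frozen=True)
class Number:
    tag: str                   # identifies which subtype is currently stored
    value: Union[float, int]   # only one subtype is held at a time

    def __post_init__(self) -> None:
        # Reject unknown tags and values that do not match the tag.
        if self.tag not in PERMITTED:
            raise ValueError(f"unknown tag: {self.tag}")
        if type(self.value) is not PERMITTED[self.tag]:
            raise TypeError(f"value does not match tag {self.tag}")


def describe(n: Number) -> str:
    """Inspect the tag before interpreting the stored value."""
    if n.tag == "float":
        return f"float {n.value:.2f}"
    return f"integer {n.value:d}"


print(describe(Number("float", 1.5)))  # float 1.50
print(describe(Number("int", 3)))      # integer 3
```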

Algebraic data types

An algebraic data type (ADT) is a possibly recursive sum type of product types. A value of an ADT consists of a constructor tag together with zero or more field values, with the constructor determining the number and type of the field values. The set of all possible values of an ADT is the set-theoretic disjoint union (sum) of the sets of all possible values of its variants (products of fields). Values of algebraic types are analysed with pattern matching, which identifies a value's constructor and extracts the fields it contains.

If there is only one constructor, the ADT corresponds to a product type similar to a tuple or record. A constructor with no fields corresponds to the empty product (unit type). If none of the constructors have fields, the ADT corresponds to an enumerated type.

One common ADT is the option type, defined in Haskell as data Maybe a = Nothing | Just a.
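For comparison, the option type can be approximated in Python with one class per constructor. This is only an analogy to the Haskell definition; the functions safe_div and show are hypothetical, and isinstance checks stand in for pattern matching.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass(frozen=True)
class Nothing:
    """Constructor with no fields."""


@dataclass(frozen=True)
class Just(Generic[T]):
    """Constructor carrying exactly one field."""
    value: T


# The sum of the two variants.
Maybe = Union[Nothing, Just[T]]


def safe_div(a: int, b: int) -> "Maybe[int]":
    """Integer division that returns Nothing instead of failing on zero."""
    return Nothing() if b == 0 else Just(a // b)


def show(result: "Maybe[int]") -> str:
    # Identify the constructor, then extract its field if it has one.
    if isinstance(result, Just):
        return f"Just {result.value}"
    return "Nothing"


print(show(safe_div(7, 2)))  # Just 3
print(show(safe_div(1, 0)))  # Nothing
```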

Data structures

Some types are especially useful for storing and retrieving data and are called data structures. Common data structures include:

An array (also known as a vector, list, or sequence) stores a number of elements and provides random access to individual elements. The elements of an array are often (but not always) required to be of the same data type. Arrays may be of fixed or expandable length. Indices into an array are typically required to be integers (if not, one may stress this relaxation by speaking of an associative array) from a specific range (if not all indices in that range correspond to elements, it may be a sparse array).

A record (also called a tuple or struct) is one of the simplest data structures. A record is a value that contains other values, typically in a fixed number and sequence and typically indexed by names. The elements of records are usually called fields or members.

An object comprises a number of data fields, similar to a record, as well as a number of subroutines called methods for accessing and updating them.

The singly linked list, which can be used to implement a queue, is defined in Haskell as the ADT data List a = Nil | Cons a (List a).

The binary tree, which allows fast searching, can be defined in Haskell as the ADT data BTree a = Nil | Node (BTree a) a (BTree a).
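The linked list and binary tree can be sketched in Python along the lines of the Haskell definitions, with None playing the role of Nil. The helper functions to_list, insert, and contains are hypothetical names, and the tree is assumed to be an ordered (search) tree holding integers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cons:
    head: int
    tail: Optional["Cons"] = None  # None marks the end of the list


@dataclass(frozen=True)
class Node:
    left: Optional["Node"]
    value: int
    right: Optional["Node"]


def to_list(cell: Optional[Cons]) -> List[int]:
    """Walk the linked list from its head, collecting the elements."""
    items = []
    while cell is not None:
        items.append(cell.head)
        cell = cell.tail
    return items


def insert(tree: Optional[Node], value: int) -> Node:
    """Return a new tree with value added, keeping smaller values on the left."""
    if tree is None:
        return Node(None, value, None)
    if value < tree.value:
        return Node(insert(tree.left, value), tree.value, tree.right)
    if value > tree.value:
        return Node(tree.left, tree.value, insert(tree.right, value))
    return tree


def contains(tree: Optional[Node], value: int) -> bool:
    """Search by descending one branch per step instead of visiting every node."""
    while tree is not None:
        if value == tree.value:
            return True
        tree = tree.left if value < tree.value else tree.right
    return False


print(to_list(Cons(1, Cons(2, Cons(3)))))  # [1, 2, 3]

tree = None
for v in (5, 2, 8, 1):
    tree = insert(tree, v)
print(contains(tree, 8), contains(tree, 7))  # True False
```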

Pointers and references

The main non-composite, derived type is the pointer, a data type whose value refers directly to (or "points to") another value stored elsewhere in computer memory using its address. It is a primitive kind of reference. (In everyday terms, a page number in a book could be considered a piece of data that refers to another one.) Pointers are often stored in a format similar to an integer; however, attempting to dereference or "look up" a pointer whose value was never a valid memory address would cause a program to crash. To mitigate this potential problem, pointers are considered a separate type from the type of data they point to, even if the underlying representation is the same.

Function types

Functional programming languages treat functions as a distinct datatype and allow values of this type to be stored in variables and passed to functions. Some multi-paradigm languages, such as JavaScript, also have mechanisms for treating functions as data. Most modern type systems go beyond JavaScript's simple "function object" type and have a family of function types differentiated by argument and return types, such as the type Int -> Bool, which denotes functions taking an integer and returning a boolean. In C, a function is not a first-class data type, but programmers can work with function pointers. Java and C++ originally lacked function values, but they were added in C++11 and Java 8.
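In Python's type hints, a function type such as Int -> Bool can be written with typing.Callable. The names IntPredicate, is_even, and keep are illustrative.

```python
from typing import Callable, List

# Roughly the counterpart of the type Int -> Bool.
IntPredicate = Callable[[int], bool]


def is_even(n: int) -> bool:
    return n % 2 == 0


def keep(predicate: IntPredicate, values: List[int]) -> List[int]:
    """Take a function as an argument and apply it to each value."""
    return [v for v in values if predicate(v)]


# A function can be stored in a variable like any other value.
check: IntPredicate = is_even

print(keep(check, [1, 2, 3, 4]))             # [2, 4]
print(keep(lambda n: n > 2, [1, 2, 3, 4]))   # [3, 4]
```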

In mathematical logic, first-order logic does not permit quantifiers to be applied to function or predicate names, whereas second-order logic does.

Type constructors

A type constructor builds new types from existing ones and can be viewed as an operator that takes zero or more types as arguments and produces a type. Type constructors can be defined for product types, function types, power types, and list types.

Quantified types

Universally quantified and existentially quantified types are based on predicate logic. Universal quantification is written as ∀x. f x or forall x. f x and is the intersection over all types x of the body f x, i.e., the value is of type f x for every x. Existential quantification is written as ∃x. f x or exists x. f x and is the union over all types x of the body f x, i.e., the value is of type f x for some x.

In Haskell, existential types must be encoded by transforming exists a. f a to forall r. (forall a. f a -> r) -> r or a similar type.

Refinement types

A refinement type is a type endowed with a predicate that is assumed to hold for every element of the refined type. For instance, the type of natural numbers greater than five may be written as { n ∈ ℕ | n > 5 }.

Dependent types

A dependent type is a type whose definition depends on a value. Two common examples are dependent functions and dependent pairs. The return type of a dependent function may depend on the value of one of its arguments, not only on its type. A dependent pair may have a second value whose type depends on the first value.

Meta types

Some programming languages represent type information as data, enabling type introspection and reflection. In contrast, higher-order type systems, while allowing types to be constructed from other types and passed to functions as values, typically avoid basing computational decisions on them.
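Python is one language in which types are themselves ordinary values, so a brief sketch of introspection is possible. The function scale is a hypothetical example used only to have annotations to inspect.

```python
from typing import get_type_hints


def scale(x: float, factor: int) -> float:
    return x * factor


value = 3.5
kind = type(value)  # the type of a value is itself a value

print(kind.__name__)           # float
print(isinstance(kind, type))  # True
# A type held in a variable can be called to build new values.
print(kind("2.5"))             # 2.5

# Declared parameter and return types can be read back at run time.
hints = get_type_hints(scale)
print(hints["factor"] is int)  # True
```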

Convenience types

For convenience, high-level languages and databases may provide ready-made "real world" data types, such as times, dates, and monetary values (currency). These may be built into the language or implemented as composite types in a library.
