Near-duplicate features of C++ (2017)
source link: https://www.tuicool.com/articles/QfqyMn3
A large collection of features within the C++ programming language have very similar functionality. Much of this is due to features being inherited straight from C, then new native C++ features being added as alternatives.
Knowing which feature to use in what situation can be a continual learning struggle. Each team might use a different subset of features, which can cause needless confusion and stylistic variation. As new features get added to the C++ language or as awareness of existing features increases, it’s sometimes necessary to revisit old codebases and revise them to comply with modern practices.
Finally, although this state of affairs is an unchangeable fact about C++, other major programming languages have far less duplication. Don’t automatically assume that C++’s situation is desirable, necessary, or inevitable.
Abstract feature | Features in C | Features in C++
File extension | Implementation: c; header: h | Implementation: cpp, cc, cxx, c++, C; header: hpp, hh, hxx, h++, H
Include guard | #ifdef, #pragma once | (No new feature in C++)
Null pointer | 0, NULL | nullptr
Integer types | int, int_least16_t | (No new feature in C++)
Operator spelling | x &= y ^ ~z; or x and_eq y xor compl z; | (No new feature in C++)
Pointer vs. reference | Pointer (*) | Reference (&)
Argument passing | Pass by address: foo(&x) | Pass by reference: foo(x)
Type alias | typedef const Foo (*Bar)[4]; | using Bar = const Foo(*)[4];
Type cast | (Type)val | Type(val), Type{val}, xxx_cast<Type>(val)
Variable initialization | Foo x = y; | Foo x(y); or Foo x{y};
Zero-argument function | int foo(void) { ... } | int foo() { ... }
Heap object with zero-argument constructor | (Unavailable in C) | Foo *x = new Foo; or Foo *x = new Foo(); or Foo *x = new Foo{};
Optional arguments | Varargs (stdarg.h) | Default function argument, function overloading
Function declaration syntax | int main() { ... } | auto main() -> int { ... }
Struct vs. class | struct | class
Field initialization | Struct initializer | Field initializer, constructor field initializer, constructor assignment statement
Hiding members | static | private, namespace
Global-ish members | (Unavailable in C) | namespace, class
Generic code | Preprocessor macro | Template function/class
Template type parameter | (Unavailable in C) | template <class T>, template <typename T>
Variable function | Function pointer | Virtual method, lambda expression
Compile-time evaluation | Macro | constexpr
Method definition location | (Unavailable in C) | Define in class, define separately
Heap allocation | malloc(), calloc() | new, new[]
Heap release | free() | delete, delete[]
String | Array of char | std::string
Sequence | T[], T* | std::vector<T>, std::array<T, N>
Element access | Integer index | Iterator
Filling memory | memset() | std::fill()
Copying memory | memcpy(), memmove() | std::copy(), std::copy_backward()
C standard library header | #include <foobar.h> | #include <cfoobar>
Input/output | #include <stdio.h> printf("%d", n); scanf("%d", &n); | #include <iostream> std::cout << n; std::cin >> n;
Random number generation | #include <stdlib.h> srand() rand() RAND_MAX | #include <random>
Exception handling | setjmp(), longjmp() | try { ... } catch (T v) { ... } throw val;

File extension
Many pieces of C source code (but not all) can be compiled in C++ mode without modifications – so .c is a valid file extension for C++ files. Many pure-C++ header files still use a .h extension (same extension as C header files) instead of a C++-specific extension.
As for the C++-specific file extensions, generally .cpp and .cc can be found in the wild. The other alternatives are rare. Unlike other major languages, there is no standardization on file extensions. Surprisingly, C doesn’t suffer from this because C files are universally named as .c or .h.
There are at least two ways to ensure that any given header file is included at most once. The standards-compliant but cumbersome and brittle way is to use an #ifndef + #define + body + #endif construct (an include guard). It requires 3 lines of code, and the defined constant needs to be manually synchronized with the file name. The convenient and widely supported (but technically non-standard) way is to simply write #pragma once at the top. If the chosen C/C++ compiler doesn't support this, it's not hard to write a script that replaces every header file's #pragma once with an auto-generated #ifndef guard.
In C, a null pointer is conventionally written as 0 or NULL, where NULL is simply a macro constant that expands to 0 or ((void *)0). The nullptr keyword introduced in C++11 has its own type (std::nullptr_t), which makes it much more type-safe and less ambiguous in overloads. Always use nullptr instead of the old NULL or 0 in C++ code.
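The overload ambiguity mentioned above can be seen in a small sketch. The two describe() overloads are hypothetical names invented for this illustration; the point is only that the literal 0 selects the int overload while nullptr can only select a pointer overload.

```cpp
#include <cassert>
#include <string>

// Two hypothetical overloads used only for this illustration.
std::string describe(int) { return "int"; }
std::string describe(const char *) { return "pointer"; }

void demo_nullptr() {
    // The literal 0 has type int, so the int overload is an exact match.
    assert(describe(0) == "int");
    // nullptr has type std::nullptr_t, which converts only to pointer types.
    assert(describe(nullptr) == "pointer");
}
```

Passing NULL here would either pick the int overload or fail to compile as ambiguous, depending on how the implementation defines the macro.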
The C and C++ standards make a number of guarantees on the bit widths of basic integer types, such as: short and int are at least 16 bits, long is at least 32 bits, width(char) ≤ width(short) ≤ width(int), et cetera. C99 introduced the stdint.h header (available in C++11 as cstdint), which defines explicitly sized types like int_least16_t. Because the simple int type is already guaranteed to have at least 16 bits, we might as well use it instead of the fancier type name.
The C language uses characters such as | and ~, and can support non-ASCII character sets. Some characters used in the language are absent from certain character sets, so alternate spellings of certain operators and tokens were added to the language. In C, these synonyms are activated by including iso646.h, whereas in C++ these synonyms are a mandatory part of the language. One consequence is that you cannot name a variable or function as and, or, not, etc. Other consequences are that the feature can lead to style disagreements or can be abused for code obfuscation.
(More info: cppreference.com: Alternative operator representations)
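As a small check that the two spellings from the summary table mean the same thing, the sketch below wraps each in a function. The function names are made up for this example, and some compilers may need a standards-conforming mode before they accept the word forms.

```cpp
#include <cassert>

// Symbolic spelling.
unsigned with_symbols(unsigned x, unsigned y, unsigned z) {
    x &= y ^ ~z;
    return x;
}

// Word spelling of exactly the same statement.
unsigned with_words(unsigned x, unsigned y, unsigned z) {
    x and_eq y xor compl z;
    return x;
}

void demo_alternative_tokens() {
    // Low byte: 0x3C ^ ~0xF0 = 0x3C ^ 0x0F = 0x33, then masked by 0xFF.
    assert(with_symbols(0xFFu, 0x3Cu, 0xF0u) == 0x33u);
    assert(with_words(0xFFu, 0x3Cu, 0xF0u) == 0x33u);
}
```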
Example using a pointer:
    int x;
    int *ptr = &x;
    *ptr = 2;
Example using a reference:
    int x;
    int &ref = x;
    ref = 2;
Both pieces of example code above behave identically. Internally, a reference is typically implemented as a pointer. Some key differences are that a reference is never nullptr, a reference cannot be redefined to refer to another object (reseating), and a reference cannot be indexed/subscripted. A pointer can do all three of these things, but the functionality is often unneeded. References are essentially restricted pointers, and don't really add new features (except possibly for checking nullptr at the time of assignment instead of at the time of reading/writing the value). References do reduce the syntactic burden of continually writing * to dereference a pointer. References tend to be more useful and idiomatic in C++ than pointers, but pointers are still indispensable for some tasks.
Passing a variable by reference requires no symbol at the call site, whereas passing by pointer requires &. While this is convenient, it hides the fact that the called function can change the value of the variable, because nothing at the call site shows that the variable's address is being handed over.
C++11 introduces a new way to create type aliases. The new way uses a different keyword, and the ordering of the tokens is arguably more natural, especially for complex types such as arrays and functions.
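A minimal sketch of the two alias syntaxes side by side, reusing the Bar type from the summary table. Foo is an empty placeholder type, and Grid and grid_cells() are hypothetical names added to show one thing that only the using form can do, namely alias templates.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

struct Foo {};  // Placeholder type for illustration

typedef const Foo (*BarOld)[4];   // Pointer to an array of 4 const Foo
using BarNew = const Foo (*)[4];  // The same type, written left to right

static_assert(std::is_same<BarOld, BarNew>::value,
              "both aliases name the same type");

// Only the using form can be parameterized (an alias template).
template <typename T>
using Grid = std::vector<std::vector<T>>;

std::size_t grid_cells() {
    Grid<int> g(2, std::vector<int>(3));  // 2 rows of 3 columns
    return g.size() * g[0].size();
}

void demo_alias() {
    assert(grid_cells() == 6);
}
```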
C++ exploded the number of ways to convert between types. For primitive types, if the old C cast of (Type)val is valid, then the constructor notations (officially called function-style casts) of Type(val) (all C++ versions) and Type{val} (since C++11) are valid too. For example:
    bool a = (...);
    int b = (...);
    long c = (...);
    float d = (...);
    char e = char(a);       // From bool
    short f = short(b);     // From int
    typedef long long LL;   // Need this for multi-token types
    long long g = LL(c);    // Can't just write: long long(c)
    double h = double{d};   // Introduced in C++11
The various language-level cast operators cover conversions on integers, constness, primitive pointers, object pointers, etc.: static_cast, const_cast, reinterpret_cast, dynamic_cast.
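To make the four named cast operators concrete, here is one small use of each. This is only a sketch: Shape, Circle, Square and the helper functions are hypothetical names invented for the example, and each helper shows a typical use rather than the full range of what the operator permits.

```cpp
#include <cassert>
#include <cstdint>

struct Shape { virtual ~Shape() {} };   // Hypothetical polymorphic base
struct Circle : Shape {};
struct Square : Shape {};

// dynamic_cast: run-time checked downcast; yields nullptr on a mismatch.
bool is_circle(Shape *s) { return dynamic_cast<Circle *>(s) != nullptr; }

// static_cast: ordinary value conversion, here truncating toward zero.
int to_int(double d) { return static_cast<int>(d); }

// const_cast: removes const. Writing through the result is only valid
// when the object being pointed at was not declared const.
void set_through_const(const int *view, int value) {
    *const_cast<int *>(view) = value;
}

// reinterpret_cast: pointer to integer and back again.
bool round_trips(int *p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int *>(bits) == p;
}

void demo_casts() {
    Circle c;
    Square s;
    assert(is_circle(&c) && !is_circle(&s));
    assert(to_int(3.9) == 3);
    int v = 10;
    set_through_const(&v, 20);
    assert(v == 20);
    assert(round_trips(&v));
}
```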
For structs and classes, a single-argument constructor without the explicit designation can be used as an implicit cast:
    class Foo {
        public: Foo(int x) {}
        explicit Foo(char *y) {}
    };
    int a = (...);
    char *b = (...);
    Foo c = a;  // OK
    Foo d = b;  // Compile-time error
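A variable can be initialized in 3 possible ways. None of them calls the assignment operator, but they differ in which constructors are considered: the form with = skips explicit constructors, and the form with braces rejects narrowing conversions and prefers constructors that take a variable-length list (std::initializer_list):
    Foo x = w;  // C style
    Foo y(w);   // C++ style
    Foo z{w};   // C++11 and above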
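The difference between parentheses and braces is easiest to see with std::vector, where the same two arguments select different constructors. The helper names below are made up for this sketch.

```cpp
#include <cassert>
#include <vector>

// Parentheses select the (count, value) constructor: three elements equal to 7.
std::vector<int> with_parens() { return std::vector<int>(3, 7); }

// Braces prefer the initializer-list constructor: the two elements 3 and 7.
std::vector<int> with_braces() { return std::vector<int>{3, 7}; }

void demo_init() {
    std::vector<int> a = with_parens();
    std::vector<int> b = with_braces();
    assert(a.size() == 3 && a[0] == 7 && a[2] == 7);
    assert(b.size() == 2 && b[0] == 3 && b[1] == 7);
}
```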
In C++, the declarations int foo(void) and int foo() are synonyms, and the simpler form with () is preferred over (void). In C, the form with (void) means that the function must take no arguments, whereas the form with () declares a function whose parameters are unspecified (in versions of C before C23), which can lead to subtle errors; hence the form with (void) is strongly recommended in C.
(More info: Stack Overflow: Is there a difference between foo(void) and foo() in C++ or C?, Stack Overflow: Is it better to use C void arguments "void foo(void)" or not "void foo()"?)
When creating an object on the heap with new and calling a zero-argument constructor, there are 3 possible notations, with the last two being semantically equivalent:
    Foo *u = new Foo;
    Foo *v = new Foo();
    Foo *w = new Foo{};
When creating an object on the stack and calling a zero-argument constructor, the parentheses option is not available because that would declare a function prototype instead:
    Foo x;    // OK
    Foo y();  // Different meaning
    Foo z{};  // OK
Now consider these class definitions:
    // POD (plain old data) type
    class A { public: int i; };
    // Non-POD type, and compiler provides default constructor
    class B { public: int i; ~B() {} };
    // Explicit constructor without initialization
    class C { public: int i; C() {} };
    // Explicit constructor with initialization
    class D { public: int i; D() { i=1; } };
If we create an object of each type without parentheses/braces (e.g. A *p = new A;), then:
- an object of type A will have i uninitialized.
- an object of type B will have i uninitialized.
- an object of type C will have the C() constructor called and i uninitialized.
- an object of type D will have the D() constructor called and i initialized to 1.
Whereas if we create an object of each type with parentheses/braces (e.g. A *p = new A();):
- an object of type A will have i value-initialized to 0.
- an object of type B will have i value-initialized to 0.
- an object of type C will have the C() constructor called and i uninitialized.
- an object of type D will have the D() constructor called and i initialized to 1.
As we can see, the parentheses/braces make no difference when the target type has a default constructor explicitly defined. Otherwise, the parentheses/braces force value-initialization, which zeroes the fields.
(More info: Stack Overflow: Do the parentheses after the type name make a difference with new?)
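The zeroing behavior can be checked directly for a type like A. The sketch below only reads the field in the cases where it is guaranteed to be zero, because reading an uninitialized int is not something a program can rely on. The function names are hypothetical.

```cpp
#include <cassert>

struct A { int i; };  // No user-provided constructor, like class A above

int heap_value_initialized() {
    A *p = new A();   // Parentheses request value-initialization: i becomes 0
    int result = p->i;
    delete p;
    return result;
}

int stack_zeroed_with_braces() {
    A a{};            // Empty braces also leave the field zeroed
    return a.i;
}

void demo_value_init() {
    assert(heap_value_initialized() == 0);
    assert(stack_zeroed_with_braces() == 0);
}
```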
A function can be declared with default argument values for optional parameters:
    int foo(int bar=0) { ... }
    print(foo());  // Equivalent to print(foo(0))
However, the above construct is a special case of the more general and powerful mechanism of function overloading:
    int foo() { return foo(0); }
    int foo(int bar) { ... }
    print(foo());  // Calls the top definition, which leads to foo(0)
By comparison, Python only has default arguments, and Java only has method overloading.
The classic C syntax (also adopted in C++, C#, D, Java, etc.) places the return type in front of the function name:
    int main(...) { ... }
C++11 allows the keyword auto to stand in as a placeholder return type, with the actual return type declared after the parameter list and an arrow:
    auto main(...) -> int { ... }
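The functional benefit of this style is that the trailing return syntax allows the return type to be expressed in terms of the function's parameters (for example, with decltype), because the parameter names are already in scope at that point.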
The trailing style could aid readability. Perhaps because of this, many new languages like Scala, Go, Rust, Swift, etc. declare functions in this way.
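A short sketch of a return type that depends on the arguments. The add() template is a hypothetical example, not something from a library; its return type is whatever a + b produces for the given argument types.

```cpp
#include <cassert>
#include <type_traits>

// The parameters a and b are in scope after the arrow, so decltype can use them.
template <typename A, typename B>
auto add(A a, B b) -> decltype(a + b) {
    return a + b;
}

static_assert(std::is_same<decltype(add(1, 2.5)), double>::value,
              "int + double yields double");
static_assert(std::is_same<decltype(add(1, 2L)), long>::value,
              "int + long yields long");

void demo_trailing_return() {
    assert(add(1, 2.5) == 3.5);
    assert(add(2, 3) == 5);
}
```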
structs and classes can both contain the same things (fields, constructors, methods, nested classes, etc.) and can have parent classes. They differ only in their defaults: the members and base classes of a struct are public by default, whereas those of a class are private by default. The cleanest approach is to use a struct if it contains only fields and no other members, and a class when constructors and methods are needed.
The fields of a struct or class can be initialized in a few possible places:
    class Foo {
        int x = 0;      // Field initializer
        int y;
        int z;
        Foo() : y(1) {  // Constructor field initializer
            z = 2;      // Constructor assignment statement
        }
    };
The constructor's initializer list (between the colon and opening brace) is mandatory for fields that cannot be assigned afterward, such as a field with a reference type, a const field, or a field whose type has no default constructor, unless the field already has a field initializer.
Note that Java suffers from three choices too, with two of them being syntactically identical to C++:
    class Bar {
        int x = 0;
        int y;
        int z;
        {  // Instance initializer block (rarely used)
            y = 1;
        }
        public Bar() {
            z = 2;
        }
    }
Variables and functions outside of classes can be confined to the compilation unit by adding static to the declaration:
    static int counter = 0;
    static void func() { ... }
Variables and functions outside of classes can also be confined to the compilation unit by putting them inside an anonymous namespace:
    namespace {
        int counter = 0;
        void func() { ... }
    }
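Members inside classes/structs are hidden with the private access modifier:
    class Test {
        // A non-const static field cannot be initialized here (before C++17);
        // it is defined once in a .cpp file as: int Test::counter = 0;
        private: static int counter;
        private: static void func() { ... }
    };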
Global-ish variables and functions can be placed inside a namespace or as static members inside a class:
    namespace Alpha {
        int gamma;
        void delta();
    }
    class Beta {
        public:
        static int gamma;
        static void delta();
    };
    // Same usage syntax
    print(Alpha::gamma);
    Alpha::delta();
    print(Beta::gamma);
    Beta::delta();
Some forms of generic code are expressible using C preprocessor macros:
    #define MAX(x, y) ((x) >= (y) ? (x) : (y))
But C++ templates are far more type-safe and powerful:
    template <typename T>
    T max(T x, T y) {
        return x >= y ? x : y;
    }
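One concrete reason the template is safer is that a macro pastes its argument text wherever the parameter appears, so an argument with a side effect can run more than once. The sketch below counts increments to show this; max_of is a renamed copy of the template above (renamed only to avoid clashing with std::max), and the two counting functions are hypothetical.

```cpp
#include <cassert>

#define MAX(x, y) ((x) >= (y) ? (x) : (y))

template <typename T>
T max_of(T x, T y) {
    return x >= y ? x : y;
}

int increments_with_macro() {
    int i = 5;
    int m = MAX(i++, 0);     // Expands to ((i++) >= (0) ? (i++) : (0))
    (void)m;
    return i;                // i++ ran twice
}

int increments_with_template() {
    int i = 5;
    int m = max_of(i++, 0);  // The argument is evaluated once, before the call
    (void)m;
    return i;
}

void demo_macro_vs_template() {
    assert(increments_with_macro() == 7);
    assert(increments_with_template() == 6);
}
```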
A template with type parameters can be specified with class (old style, discouraged) or typename (modern style). The two keywords mean exactly the same thing in this position.
Function pointers are one way to convey a variable function (this comes from C):
    int foo() { ... }
    int bar() { ... }
    int (*chosen)() = choice ? foo : bar;
    print(chosen());
Objects with virtual methods are another way to convey a variable function (and this is the only way in Java):
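    class Base { public: virtual int doIt(); };
    class Foo : public Base { public: virtual int doIt() { ... } };
    class Bar : public Base { public: virtual int doIt() { ... } };
    // Both branches of ?: must have one common type, hence the cast to Base*
    Base *chosen = choice ? static_cast<Base*>(new Foo()) : new Bar();
    print(chosen->doIt());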
Lambda expressions (introduced in C++11) provide a new way to convey a variable function. A lambda that captures nothing converts implicitly to a plain function pointer:
    auto foo = []() { return 0; };
    auto bar = []() { return 1; };
    int (*chosen)() = choice ? foo : bar;
C and C++ provide ways to define complicated expressions/functions such that the compiler evaluates their ultimate value at compile time (instead of at run time). The mechanism provided by C is macro functions, which is limiting and brittle. In C++, specializations of templates can be used to achieve compile-time evaluation of values, but this is cumbersome. C++11 adds the constexpr keyword and rules for what compile-time-evaluable functions are allowed to do.
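A minimal constexpr sketch, written as a single return statement so that it also satisfies the stricter C++11 rules. The names factorial and lookup_table are chosen for this example only.

```cpp
#include <cassert>

// Recursion in a single return statement keeps this valid under C++11.
constexpr unsigned long long factorial(unsigned n) {
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}

// The compiler must evaluate the call to check this assertion.
static_assert(factorial(10) == 3628800ULL, "10! computed at compile time");

// A constexpr result can be used where a constant is required,
// such as an array bound.
int lookup_table[factorial(4)];

void demo_constexpr() {
    assert(sizeof(lookup_table) / sizeof(lookup_table[0]) == 24);
    assert(factorial(5) == 120);  // The same function also works at run time
}
```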
A class can have its methods defined in the class declaration, for example:
    class Foo {
        void bar() { ... do stuff ... }
        int qux() { ... do stuff ... }
    };
If the class is a template, then the method definitions generally must be visible in the header as well, either inside the class body as above or directly after it. Also, defining methods inside the class is the only format in newer languages like Java, C#, etc., which have no concept of a function prototype.
Otherwise, the class can be declared with a bunch of empty method prototypes:
    class Foo {
        void bar();
        int qux();
    };
Subsequently, the method definitions are placed in a .cpp file:
    #include "Foo.hpp"
    void Foo::bar() { ... do stuff ... }
    int Foo::qux() { ... do stuff ... }
The advantage of defining methods in the class is that it reduces duplication, which makes code reading and refactoring easier. The advantage of defining separately is that it allows separate compilation and parallel builds.
The idiomatic C++ way to allocate an object on the heap is to use the new operator:
    class Foo { ... };
    Foo *x = new Foo;
    Foo *y = new Foo[10];
An alternative way that allows lower level control is to use malloc() (from C, declared in stdlib.h) and manually call placement-new (declared in the new header):
    Foo *x = (Foo*)malloc(sizeof(Foo));
    new (x) Foo;
    Foo *y = (Foo*)malloc(10 * sizeof(Foo));
    new (&y[0]) Foo;
    new (&y[1]) Foo;
    (... et cetera ...)
When a heap object is allocated with malloc() (both scalars and arrays), simply call free() on the pointer. If objects were constructed in that memory with placement-new, first call each object's destructor explicitly (e.g. ptr->~Type()), because free() does not run destructors.
When a single heap object is allocated with new, it must be released with delete. But an array of heap objects like ptr = new Type[n] must be released with delete[] ptr. The distinction between delete and delete[] must be carefully respected, or else undefined behavior occurs.
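The three allocate/release pairings can be put side by side in one sketch. Counter is a hypothetical class that tracks how many instances are alive, so the example can confirm that every constructor call was matched by a destructor call.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

struct Counter {          // Hypothetical class that counts live instances
    static int alive;
    Counter() { ++alive; }
    ~Counter() { --alive; }
};
int Counter::alive = 0;

int demo_lifetimes() {
    // Pair 1: new with delete
    Counter *a = new Counter;
    delete a;

    // Pair 2: new[] with delete[]
    Counter *b = new Counter[3];
    int peak = Counter::alive;      // 3 objects exist at this point
    delete[] b;

    // Pair 3: malloc() + placement-new with explicit destructor + free()
    void *raw = std::malloc(sizeof(Counter));
    if (raw == nullptr) return -1;  // Allocation failed
    Counter *c = new (raw) Counter;
    c->~Counter();                  // free() would not run the destructor
    std::free(raw);

    return peak * 10 + Counter::alive;
}
```

With every object released, the function returns 30: a peak of 3 live objects, and 0 remaining at the end.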
Raw C strings are popular in C++ but cumbersome when it comes to memory allocation:
    #include <string.h>
    char buffer[100] = "Hello";  // Need to set size
    strcat(buffer, " world");    // Need to avoid overrun
C++ provides a string library that handles memory allocation under the hood:
    #include <string>
    std::string str("Hello");
    str += " world";
    const char *cstr = str.c_str();  // Easy conversion
C has arrays (supported natively in the language) and linked lists (supported manually through structs). C++ adds safer, more powerful, and more convenient implementations of the sequence ADT, primarily std::vector, std::array, and std::list (linked list).
An array is accessed by an integer index:
    int *a = (...);
    int index = 5;
    print(a[index]);
A vector can be accessed by index or iterator:
    std::vector<int> b = (...);
    print(b[index]);     // No bounds checking
    print(b.at(index));  // Bounds-checked
    std::vector<int>::iterator it = b.begin();
    ++it;
    print(*it);          // Same as b[1]
C only has the memset() function to fill a block of memory with a repeated char-sized value. It is mainly useful for setting to zero, or occasionally to 0xFF. It cannot fill a multi-byte value or work with specific struct fields. However, this simplicity and narrow scope makes it relatively easy to have an assembly-optimized implementation in the standard library.
The std::fill() function in C++ is essentially a loop that performs a value assignment on each element within a range. This means it works on types of any size, and it invokes the element type's assignment operator (with possible computations and side effects).
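The byte-versus-element distinction shows up as soon as the fill value is not 0. In the sketch below (function names are hypothetical), memset() writes the byte 0x01 into every byte of each int, while std::fill() assigns the int value 1 to each element.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>

int first_after_memset() {
    int a[4];
    std::memset(a, 1, sizeof(a));  // Every byte becomes 0x01
    return a[0];                   // 0x01010101 when int is 4 bytes wide
}

int first_after_fill() {
    int a[4];
    std::fill(a, a + 4, 1);        // Every element becomes 1
    return a[0];
}

void demo_fill() {
    assert(first_after_fill() == 1);
    assert(first_after_memset() != 1);
    assert(sizeof(int) != 4 || first_after_memset() == 0x01010101);
}
```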
The C way to copy an array of values is to call the memcpy() or memmove() function. This is also appropriate in C++ for arrays of numbers and simple structs (trivially copyable types).
The C++ way to copy a sequence of values is to call the std::copy() or std::copy_backward() function. Choosing which function to use is only relevant if the input and output ranges overlap; otherwise std::copy() is fine. Compared to memcpy(), the function std::copy() also works on std::vector and other container types with iterators, and will properly call the (possibly overloaded) assignment operator of the element type to set the destination values.
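For overlapping ranges, the choice depends on the direction of the shift. The two helpers below are a hypothetical sketch: shifting elements toward the end needs std::copy_backward(), while shifting toward the beginning can use std::copy().

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Shift right: the destination starts inside the source range, so copy from
// the last element backward to avoid overwriting values not yet copied.
std::vector<int> shift_right_by_one(std::vector<int> v) {
    if (v.size() > 1)
        std::copy_backward(v.begin(), v.end() - 1, v.end());
    return v;
}

// Shift left: the destination starts before the source range, so a forward
// copy is safe.
std::vector<int> shift_left_by_one(std::vector<int> v) {
    if (v.size() > 1)
        std::copy(v.begin() + 1, v.end(), v.begin());
    return v;
}

void demo_copy() {
    std::vector<int> input{1, 2, 3, 4};
    std::vector<int> right{1, 1, 2, 3};
    std::vector<int> left{2, 3, 4, 4};
    assert(shift_right_by_one(input) == right);
    assert(shift_left_by_one(input) == left);
}
```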
Almost all C++ code depends on features of the C standard library (which are a part of C++). Including a C standard library header file can be done in one of two ways:
    #include <stdfoo.h>  // Old (compatible with C)
    #include <cstdfoo>   // New (pure C++)
These ways are almost equivalent except for the subtle matter of namespacing. The first way guarantees that members will be available in the global namespace, e.g. size_t and printf(). The second way guarantees that members will be available in the std namespace, e.g. std::size_t and std::printf(). (Preprocessor macros have no namespace and are always global.) This means it is technically not portable to #include <cstdint> and use the type uint32_t, because only the name with the std:: prefix is guaranteed to be declared. However, most compilers make both the global name and the std-namespaced name available, which masks this subtle error.
The C way of doing I/O is through FILE* handles, fread() and fwrite(), and printf() and scanf() functions with format strings and variable-length arguments. Note that the stdio library covers I/O for the console, files, and strings.
The C++ way of doing I/O is through objects derived from the istream and ostream classes, calling instance methods, using the overloaded << and >> operators, and passing option objects into the overloaded operators. The functionality of C's stdio is covered by multiple C++ headers such as iostream, fstream, sstream.
The RNG library of C is small, making it easy but weak at the same time. There is only one global generator state. srand() has a rather small range for a seed. RAND_MAX is often defined as 2^15 − 1 or 2^31 − 1, which makes it painful to generate large numbers (such as uint64_t) or double-precision floating-point numbers.
The RNG library of C++ is simultaneously fancy and intimidating. Each RNG is a separate object, and can be chosen from multiple implementations: linear congruential, Mersenne Twister, hardware RNG, etc. To generate a random number, you have to first define a distribution, such as integers in the range [a, b] or a Boolean with probability p, then call the distribution with the generator, i.e. double val = dist(gen);.
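The generator-plus-distribution pattern looks like this in practice. This is a sketch under assumptions: roll_die() is a made-up helper, and the fixed seed is arbitrary and used only to make the run repeatable (a program that wants different results on each run could seed from std::random_device instead).

```cpp
#include <cassert>
#include <random>

// Draws one integer from the inclusive range [1, 6] using the given generator.
int roll_die(std::mt19937 &gen) {
    std::uniform_int_distribution<int> dist(1, 6);
    return dist(gen);
}

void demo_random() {
    std::mt19937 gen(12345u);  // Mersenne Twister with an arbitrary fixed seed
    for (int i = 0; i < 1000; ++i) {
        int r = roll_die(gen);
        assert(1 <= r && r <= 6);
    }
}
```

Because the generator is an ordinary object, separate parts of a program can each own one without sharing hidden global state.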
In C, the closest mechanism to modern exception handling is the pair of functions setjmp() and longjmp(). Otherwise, exceptional situations are conveyed through function return values, global status code variables/functions, or by signals.
The C++ exception mechanism with try, catch, and throw is used in many other languages. try blocks can be nested, and different catch blocks are used to catch different types of values that are thrown.
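Nesting and type-based dispatch can both be seen in one small function. classify() and its numeric codes are hypothetical; the inner handler catches one exception type and rethrows it as another, and the outer handlers pick a branch by the type of the thrown value.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

std::string classify(int code) {
    try {
        try {
            if (code == 1) throw std::out_of_range("index too large");
            if (code == 2) throw 42;  // Any copyable value can be thrown
            return "no error";
        } catch (const std::out_of_range &e) {
            // Inner handler: wrap the error and throw a different type.
            throw std::runtime_error(std::string("wrapped: ") + e.what());
        }
    } catch (const std::runtime_error &e) {
        return e.what();
    } catch (int value) {
        // The int passed through the inner try block uncaught.
        return "int " + std::to_string(value);
    }
}

void demo_exceptions() {
    assert(classify(0) == "no error");
    assert(classify(1) == "wrapped: index too large");
    assert(classify(2) == "int 42");
}
```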